Mus101 and homologue thereof

ABSTRACT

Polynucleotides encoding a novel Drosophila gene product designated mus101 and homologues thereof, as well as mus101 polypeptides, are provided. Polynucleotide probes derived from the nucleotide sequence of mus101 and antibodies that bind to mus101 protein are also provided, as well as assays for identifying substances that regulate mus101 function.

FIELD OF THE INVENTION

[0001] The present invention relates to Drosophila mus101, a member of the BRCT superfamily. The present invention also relates to the use of mus101 and homologues thereof in assays to identify substances capable of disrupting mus101 function.

BACKGROUND TO THE INVENTION

[0002] The first screens for mutagen sensitive mutants were performed in the early '70s (Boyd et al., 1976; Boyd et al., 1981; Henderson et al., 1987; Smith, 1976; Snyder and Smith, 1982, for a review, see Boyd et al., 1987). These screens were made with the objective of recovering mutants that displayed increased sensitivity to the monofunctional alkylating agent methyl methanesulphonate (MMS). The mutagenic scheme used in each of these screens was similar, and was based upon mutagenising flies with ethyl methanesulphonate (EMS) and the subsequent recovery of progeny that were sensitive to MMS. Some screens selected mutants that displayed sensitivity to damaging agents other than MMS. Boyd et al., 1981 for example, screened for sensitivity to MMS, N-acetyl-2-aminofluorene (AAF) and nitrogen mustard (HN2) and Henderson et al., 1987 screened for sensitivity to MMS, AAF, HN2 and γ-radiation.

[0003] The screens carried out to date have led to the identification of 33 mutagen sensitive loci in the Drosophila genome. The distribution of these loci along the 3 major chromosomes is as follows (Lindsley and Zimm, 1992, or the FlyBase web site): 10 on the X-chromosome (mus101, mus102, mus105, mus106, mus108, mus109, mus111, mus112, mei-9 and mei-41), 12 on the second chromosome (mus201-mus211 and phr) and 11 on the third chromosome (mus301, mus302, mus304-mus312). mus101 was isolated in one of the first screens for mutants sensitive to MMS (Boyd et al., 1976). Two EMS-induced mus101 alleles were isolated: mus101^(D1) and mus101^(D2). Homozygous and hemizygous larvae from both alleles are sensitive to MMS, HN2, AAF and γ-rays, but are resistant to UV radiation. Other alleles of mus101 showing different phenotypes were later isolated: mus101^(K451) (fs(1)K451) is female sterile (Komitopoulou et al., 1983; Komitopoulou et al., 1988; Orr et al., 1984) and MMS sensitive (Yamamoto, R. R. and Henderson, D. S., unpublished results); mus101^(sm) and mus101^(lcd) are late larval lethals (Axton, 1990); and mus101^(tsl) is a conditional, temperature sensitive lethal allele (Gatti et al., 1983; Smith et al., 1985).

[0004] The mutant mus101^(D1) is partially deficient in postreplication repair (Boyd and Setlow, 1976; Brown and Boyd, 1981), as shown by a reduction in the molecular weight of the newly synthesised DNA molecules after UV treatment compared to wild type. Non-irradiated cells from this mutant have a slower rate of gain in molecular weight than control, suggesting a defect in DNA replication (Boyd and Setlow, 1976).

[0005] The mutant mus101^(tsl) results in high levels of chromosomal instability, assessed by the presence of mwh clones (Smith et al., 1985). mus101^(tsl) is a hypomorphic allele, since the number of mwh clones is increased in the combination mus101^(tsl)/Df(1)HA92 (12A6-7, 12D3). The chromosome instability observed genetically is also observed cytologically. Neuroblast cells from homozygous mus101^(tsl) show altered condensation of the heterochromatic region of chromosomes. In this mutant the heterochromatin is abnormally undercondensed, a defect that presents first in the Y chromosome. After 24 hours at the restrictive temperature, heterochromatic regions in the autosomes and X chromosome are also undercondensed and a significant number of cells with broken chromosomes are also seen (Gatti et al., 1983; Smith et al., 1985).

[0006] Similar genetic and cytological phenotypes are observed for mus101^(D2)/Df(1)HA92 and mus101^(D2)/mus101^(tsl) (Smith et al., 1985). Chromosomal instability has also been observed genetically for homozygous mus101^(D2) females (Baker and Smith, 1979).

[0007] The fact that no cytological defect has been observed in ganglia from homozygous mus101^(D2) females (Gatti, 1979) suggests that this is a hypomorphic allele.

[0008] Analysis of neuroblast cells from mus101^(lcd) also shows irregularities in the heterochromatic region, represented by undercondensation of pericentric heterochromatin (FIG. 1.1B and C). Chromosome fragmentation is also observed (Axton, 1990).

[0009] The female sterile allele mus101^(K451) was isolated in a screen for mutants affecting eggshell formation (Komitopoulou et al., 1983). Females homozygous for this mutation have reduced viability and produce flaccid eggs, with thin eggshells and small dorsal appendages (Komitopoulou et al., 1983; Komitopoulou et al., 1988).

[0010] The chorion defects observed in mus101^(K451) are due to a reduced level of the major chorion proteins (Komitopoulou et al., 1988). This in turn is due to reduced amplification of the chorion genes, especially those located on the third chromosome (Orr et al., 1984). The amplification of chorion genes is a specific form of DNA replication that occurs in follicle cells and is essential for rapid eggshell formation. Two major gene clusters undergo amplification: one on the X chromosome, and one on the third chromosome (for review see Orr-Weaver, 1991). mus101⁺ regulates chorion gene amplification in trans (Orr et al., 1984).

[0011] The phenotypes of the various mus101 alleles summarised above suggest a role for the mus101⁺ gene product in different aspects of chromosome maintenance. Mus101 protein may be involved in the postreplication repair of DNA (mus101^(D1)), DNA replication (mus101^(K451)), chromatin condensation and chromosome stability (mus101^(tsl), mus101^(D2) and mus101^(lcd)). Gatti et al., 1983 suggested that the mutagen-sensitivity and repair-defective phenotypes shown by viable mus101 alleles are secondary consequences of a primary effect on chromosome condensation that either renders chromatin more susceptible to mutagen damage or less available to repair. The opposite is also possible: defects in DNA replication could lead to the heterochromatin undercondensation observed in some mutants. It is also possible that Mus101 protein is involved in different cellular processes, if it associates with different proteins to perform different tasks. It is evident that the molecular cloning of such an interesting gene should bring more insight into the function of its encoded protein.

SUMMARY OF THE INVENTION

[0012] We have now identified and isolated genomic and cDNA clones encoding mus101, enabling the deduction of the Mus101 amino acid sequence and recombinant expression of Mus101 protein.

[0013] Accordingly, the invention provides a mus101 polypeptide or a homologue thereof. The polypeptide preferably has one or more of the following additional features:

[0014] (1) one or more BRCT domains, preferably five or more BRCT domains, more preferably seven BRCT domains;

[0015] (2) a molecular mass of from x to y kDa, as determined by SDS-PAGE;

[0016] (3) binds to topoisomerase II or a homologue thereof, more preferably Topo II β;

[0017] (4) localises to the spindle poles.

[0018] Preferably the polypeptide is encoded by a cDNA sequence obtainable from a eukaryotic cDNA library, preferably a metazoan cDNA library (such as insect or mammalian) said DNA sequence comprising a DNA sequence being selectively detectable with a Drosophila mus101 nucleotide sequence as shown in SEQ ID No. 1 or a fragment thereof.

[0019] The term “selectively detectable” means that the cDNA used as a probe is used under conditions where a target cDNA of the invention is found to hybridize to the probe at a level significantly above background. The background hybridization may occur because of other cDNAs present in the cDNA library. In this event background implies a level of signal generated by interaction between the probe and a non-specific cDNA member of the library which is less than 10 fold, preferably less than 100 fold as intense as the specific interaction observed with the target cDNA. The intensity of interaction may be measured, for example, by radiolabelling the probe, e.g. with ³²P. Suitable conditions may be found by reference to the Examples.

[0020] The invention also provides the mus101 polypeptide of SEQ ID No. 2 and derivatives, variants and homologues thereof, polypeptide fragments thereof, as well as antibodies capable of binding the mus101 protein or polypeptide fragments thereof.

[0021] In another aspect, the present invention provides a polynucleotide selected from:

[0022] (a) polynucleotides comprising the nucleotide sequence set out in SEQ ID No. 1 or the complement thereof.

[0023] (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequence set out in SEQ ID No. 1, or a fragment thereof.

[0024] (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequence set out in SEQ ID No. 1 or a fragment thereof.

[0025] (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).

[0026] Also provided are polynucleotides encoding polypeptides of the invention. All such polynucleotides will be referred to as a polynucleotide of the invention. A polynucleotide of the invention includes a polynucleotide having a sequence as shown in SEQ ID No. 1 and fragments thereof capable of selectively hybridising to the mus101 gene.

[0027] In a further aspect, the invention provides recombinant vectors carrying a polynucleotide of the invention, including expression vectors, and methods of growing such vectors in a suitable host cell, for example under conditions in which expression of a protein or polypeptide encoded by a sequence of the invention occurs.

[0028] In an additional aspect, the invention provides kits comprising polynucleotides, polypeptides or antibodies of the invention and methods of using such kits in diagnosing the presence or absence of mus101 and its homologues, or variants thereof, including deleterious mutants.

[0029] In a further aspect, the present invention provides the use of a mus101 polypeptide or homologue, derivative, variant or fragment thereof in a method of identifying a substance capable of affecting mus101 function. For example, the invention provides the use of a mus101 polypeptide or homologue, derivative, variant or fragment thereof in an assay for identifying a substance capable of increasing the susceptibility of a cell to DNA damaging agents. Other possible mus101 functions for which it may be desired to identify substances which affect such functions include DNA repair and cell cycle regulation.

[0030] In this respect, the invention also provides a method for identifying a substance capable of binding to a mus101 polypeptide or a homologue, derivative, variant or fragment thereof, which method comprises incubating the mus101 polypeptide or homologue, derivative, variant or fragment thereof with a candidate substance and determining whether the substance binds to the mus101 polypeptide or homologue, derivative, variant or fragment thereof.

[0031] Also provided is a substance identified by the above methods of the invention. Such substances may be used in a method of therapy, such as in a method of affecting mus101 function, in particular, increasing the susceptibility of a cell to DNA damaging agents.

[0032] The invention also provides a process comprising the steps of:

[0033] (a) performing one of the above methods; and

[0034] (b) preparing a quantity of those one or more substances identified as being capable of binding to a mus101 polypeptide or homologue, derivative, variant or fragment thereof.

[0035] Also provided is a process comprising the steps of:

[0036] (a) performing one of the above methods; and

[0037] (b) preparing a pharmaceutical composition comprising one or more substances identified as being capable of binding to a mus101 polypeptide or homologue, derivative, variant or fragment thereof.

[0038] The present invention further provides a method of treating a tumour comprising administering to a patient in need of treatment an effective amount of a polynucleotide, polypeptide or antibody of the invention.

DETAILED DESCRIPTION OF THE INVENTION

[0039] Although in general the techniques mentioned herein are well known in the art, reference may be made in particular to Sambrook et al., Molecular Cloning, A Laboratory Manual (1989) and Ausubel et al., Current Protocols in Molecular Biology (1999) 4^(th) Ed, John Wiley & Sons, Inc.

[0040] A. Polypeptides

[0041] It will be understood that polypeptides of the invention are not limited to polypeptides having the amino acid sequence set out in SEQ. ID. No. 2 or fragments thereof but also include homologous sequences obtained from any source, for example related viral/bacterial proteins, cellular homologues and synthetic peptides, as well as variants or derivatives thereof.

[0042] Thus polypeptides of the invention also include mus101 homologues from other species including animals such as mammals (e.g. mice, rats or rabbits), especially primates, more especially humans. More specifically, Mus101 homologues included within the scope of the invention for use in assays and methods of treatment include human mus101.

[0043] Thus, the present invention covers variants, homologues or derivatives of the amino acid sequence set out in SEQ ID No. 2 of the present invention, as well as variants, homologues or derivatives of the nucleotide sequence coding for the amino acid sequences of the present invention.

[0044] In the context of the present invention, a homologous sequence is taken to include an amino acid sequence which is at least 15, 20, 25, 30, 40, 50, 60, 70, 80 or 90% identical, preferably at least 95 or 98% identical at the amino acid level over at least 50 or 100, preferably 200, 300, 400 or 500 amino acids with SEQ ID No. 2. In particular, homology should typically be considered with respect to those regions of the sequence known to be essential for protein function rather than non-essential neighbouring sequences. This is especially important when considering homologous sequences from distantly related organisms. Details of particular comparisons between Drosophila mus101 and human TopBP1 are given in the Examples and indicate that amino acid homology (identity) can be as low as about 25 or 30% over the complete sequence or a fragment comprising BRCT domains. Homology may also be considered with respect to the homologous human sequences described below.

[0045] Particularly preferred regions over which to conduct homology comparisons are the BRCT domains, amino acids X and/or Y of SEQ ID No. 2. Another important region is the region showing homology to human treacle protein (TCOF1) (amino acids 820 to 956 of SEQ ID No. 2). Preferably homology to this region is at least 25%, more preferably at least 30 or 35%.

[0046] Although homology can also be considered in terms of similarity (i.e. amino acid residues having similar chemical properties/functions), in the context of the present invention it is preferred to express homology in terms of sequence identity.

[0047] Homology comparisons can be conducted by eye, or more usually, with the aid of readily available sequence comparison programs. These commercially available computer programs can calculate % homology between two or more sequences.

[0048] % homology may be calculated over contiguous sequences, i.e. one sequence is aligned with the other sequence and each amino acid in one sequence directly compared with the corresponding amino acid in the other sequence, one residue at a time. This is called an “ungapped” alignment. Typically, such ungapped alignments are performed only over a relatively short number of residues (for example less than 50 contiguous amino acids).
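By way of illustration only, the residue-by-residue comparison described above can be sketched in Python as follows. The function name and the two ten-residue peptides are hypothetical and are not taken from SEQ ID No. 2; the sketch assumes both sequences have already been trimmed to the same length.

```python
def ungapped_percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two equal-length sequences, compared one residue at a time.

    No gaps are introduced: position i of seq_a is compared only with
    position i of seq_b.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("an ungapped comparison needs sequences of equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Hypothetical peptides differing at a single position (I versus L).
identity = ungapped_percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(f"{identity:.1f}% identical")
```

Because the comparison is strictly positional, a single inserted or deleted residue in one sequence would shift every later position out of register.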

[0049] Although this is a very simple and consistent method, it fails to take into consideration that, for example, in an otherwise identical pair of sequences, one insertion or deletion will cause the following amino acid residues to be put out of alignment, thus potentially resulting in a large reduction in % homology when a global alignment is performed. Consequently, most sequence comparison methods are designed to produce optimal alignments that take into consideration possible insertions and deletions without penalising unduly the overall homology score. This is achieved by inserting “gaps” in the sequence alignment to try to maximise local homology.

[0050] However, these more complex methods assign “gap penalties” to each gap that occurs in the alignment so that, for the same number of identical amino acids, a sequence alignment with as few gaps as possible (reflecting higher relatedness between the two compared sequences) will achieve a higher score than one with many gaps. “Affine gap costs” are typically used that charge a relatively high cost for the existence of a gap and a smaller penalty for each subsequent residue in the gap. This is the most commonly used gap scoring system. High gap penalties will of course produce optimised alignments with fewer gaps. Most alignment programs allow the gap penalties to be modified. However, it is preferred to use the default values when using such software for sequence comparisons. For example, when using the GCG Wisconsin Bestfit package (see below) the default gap penalty for amino acid sequences is −12 for a gap and −4 for each extension.
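The effect of an affine gap cost can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sketch assumes one particular convention, namely that the opening penalty is charged once per gap and the extension penalty once for every residue in the gap; individual alignment programs may count the first residue differently. The default values are the amino acid penalties quoted above.

```python
def affine_gap_penalty(gap_length: int, gap_open: int = -12, gap_extend: int = -4) -> int:
    """Penalty for a single gap of gap_length residues.

    Assumed convention: gap_open is charged once for the existence of the gap
    and gap_extend is charged for each residue the gap spans.
    """
    if gap_length < 1:
        raise ValueError("a gap spans at least one residue")
    return gap_open + gap_extend * gap_length


def total_gap_penalty(gap_lengths: list[int]) -> int:
    """Sum of the penalties for every gap in an alignment."""
    return sum(affine_gap_penalty(length) for length in gap_lengths)


# Six gapped residues arranged as one long gap or as three short gaps.
one_long_gap = total_gap_penalty([6])
three_short_gaps = total_gap_penalty([2, 2, 2])
print(one_long_gap, three_short_gaps)
```

Under this convention the single six-residue gap is penalised less heavily than three two-residue gaps covering the same number of residues, which is the behaviour the affine scheme is intended to produce.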

[0051] Calculation of maximum % homology therefore firstly requires the production of an optimal alignment, taking into consideration gap penalties. A suitable computer program for carrying out such an alignment is the GCG Wisconsin Bestfit package (University of Wisconsin, U.S.A.; Devereux et al., 1984, Nucleic Acids Research 12:387). Examples of other software that can perform sequence comparisons include, but are not limited to, the BLAST package (see Ausubel et al., 1999 ibid, Chapter 18), FASTA (Altschul et al., 1990, J. Mol. Biol., 403-410) and the GENEWORKS suite of comparison tools. Both BLAST and FASTA are available for offline and online searching (see Ausubel et al., 1999 ibid, pages 7-58 to 7-60). However it is preferred to use the GCG Bestfit program.

[0052] Although the final % homology can be measured in terms of identity, the alignment process itself is typically not based on an all-or-nothing pair comparison. Instead, a scaled similarity score matrix is generally used that assigns scores to each pairwise comparison based on chemical similarity or evolutionary distance. An example of such a matrix commonly used is the BLOSUM62 matrix—the default matrix for the BLAST suite of programs. GCG Wisconsin programs generally use either the public default values or a custom symbol comparison table if supplied (see user manual for further details). It is preferred to use the public default values for the GCG package, or in the case of other software, the default matrix, such as BLOSUM62.

[0053] Once the software has produced an optimal alignment, it is possible to calculate % homology, preferably % sequence identity. The software typically does this as part of the sequence comparison and generates a numerical result.

[0054] The terms “variant” or “derivative” in relation to the amino acid sequences of the present invention include any substitution of, variation of, modification of, replacement of, deletion of or addition of one (or more) amino acids from or to the sequence, provided that the resultant amino acid sequence retains substantially the same activity as the unmodified sequence, preferably having at least the same activity as the polypeptides presented in the sequence listings.

[0055] Polypeptides having the amino acid sequence shown in SEQ I.D. No. 2, or fragments or homologues thereof may be modified for use in the present invention. Typically, modifications are made that maintain the biological activity of the sequence. Amino acid substitutions may be made, for example from 1, 2 or 3 to 10, 20 or 30 substitutions provided that the modified sequence retains the biological activity of the unmodified sequence. Alternatively, modifications may be made to deliberately inactivate one or more functional domains of the polypeptides of the invention. Amino acid substitutions may include the use of non-naturally occurring analogues, for example to increase blood plasma half-life of a therapeutically administered polypeptide.

[0056] Conservative substitutions may be made, for example according to the Table below. Amino acids in the same block in the second column and preferably in the same line in the third column may be substituted for each other:

ALIPHATIC    Non-polar            G A P
                                  I L V
             Polar - uncharged    C S T M
                                  N Q
             Polar - charged      D E
                                  K R
AROMATIC                          H F W Y
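Purely as a sketch, the block-level rule of the Table can be expressed in Python as a lookup. The dictionary and function names are hypothetical. Only the second-column blocks are encoded, so the further preference for residues on the same line of the third column is not modelled here.

```python
# Second-column blocks of the Table, using one-letter amino acid codes.
CONSERVATIVE_BLOCKS = {
    "aliphatic, non-polar": set("GAPILV"),
    "aliphatic, polar uncharged": set("CSTMNQ"),
    "aliphatic, polar charged": set("DEKR"),
    "aromatic": set("HFWY"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall within the same block of the Table."""
    original, replacement = original.upper(), replacement.upper()
    return any(
        original in block and replacement in block
        for block in CONSERVATIVE_BLOCKS.values()
    )


print(is_conservative("L", "V"))  # both aliphatic, non-polar
print(is_conservative("L", "D"))  # non-polar versus polar charged
```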

[0057] Polypeptides of the invention also include fragments of the full length sequences mentioned above. Preferably said fragments comprise at least one epitope. Fragments will typically comprise at least 6 amino acids, more preferably at least 10, 20, 30, 50 or 100 amino acids. Preferred fragments comprise functional domains of the full length mus101 polypeptide (such as the BRCT domains described above), for example fragments capable of binding to topoisomerase II. In particular, preferred fragments comprise the seven BRCT domains.

[0058] In a particularly preferred aspect of the invention, the full length human TopBP1 protein sequence described by Yamane et al., 1997, is specifically excluded from the scope of the term “polypeptides of the invention”. The full length sequence of this protein is available under Accession no. AB019397 (gi 3845612). However, it is preferred that mus101 polypeptide homologues within the scope of the invention include polypeptides having less than 99, 98, 95 or 90% homology but more than 30, 40 or 50% homology to the full length sequence set out as Accession no. AB019397.

[0059] Proteins of the invention are typically made by recombinant means, for example as described below. However they may also be made by synthetic means using techniques well known to skilled persons such as solid phase synthesis. Proteins of the invention may also be produced as fusion proteins, for example to aid in extraction and purification. Examples of fusion protein partners include glutathione-S-transferase (GST), 6×His, GAL4 (DNA binding and/or transcriptional activation domains) and β-galactosidase. It may also be convenient to include a proteolytic cleavage site between the fusion protein partner and the protein sequence of interest to allow removal of fusion protein sequences. Preferably the fusion protein partner will not hinder the function of the protein sequence of interest.

[0060] Proteins of the invention may be in a substantially isolated form. It will be understood that the protein may be mixed with carriers or diluents which will not interfere with the intended purpose of the protein and still be regarded as substantially isolated. A protein of the invention may also be in a substantially purified form, in which case it will generally comprise the protein in a preparation in which more than 90%, e.g. 95%, 98% or 99% of the protein in the preparation is a protein of the invention.

[0061] A polypeptide of the invention may be labelled with a revealing label. The revealing label may be any suitable label which allows the polypeptide to be detected. Suitable labels include radioisotopes, e.g. ¹²⁵I, enzymes, antibodies, polynucleotides and linkers such as biotin. Labelled polypeptides of the invention may be used in diagnostic procedures such as immunoassays to determine the amount of a polypeptide of the invention in a sample. Polypeptides or labelled polypeptides of the invention may also be used in serological or cell-mediated immune assays for the detection of immune reactivity to said polypeptides in animals and humans using standard protocols.

[0062] A polypeptide or labelled polypeptide of the invention or fragment thereof may also be fixed to a solid phase, for example the surface of an immunoassay well or dipstick. Such labelled and/or immobilised polypeptides may be packaged into kits in a suitable container along with suitable reagents, controls, instructions and the like. Such polypeptides and kits may be used in methods of detection of antibodies to the mus101 polypeptides or their allelic or species variants by immunoassay.

[0063] Immunoassay methods are well known in the art and will generally comprise:

[0064] (a) providing a polypeptide comprising an epitope bindable by an antibody against said protein;

[0065] (b) incubating a biological sample with said polypeptide under conditions which allow for the formation of an antibody-antigen complex; and

[0066] (c) determining whether antibody-antigen complex comprising said polypeptide is formed.

[0067] Polypeptides of the invention may be used in in vitro or in vivo cell culture systems to study the role of mus101 and its homologues in disease. For example, truncated or modified mus101 may be introduced into a cell to disrupt the normal functions which occur in the cell. Specific examples may include fragments of mus101 or its homologues which comprise only the BRCT domains, such as BRCT I, II, III, IV, V, VI or VII or any combination thereof. The polypeptides of the invention may be introduced into the cell by in situ expression of the polypeptide from a recombinant expression vector (see below). The expression vector optionally carries an inducible promoter to control the expression of the polypeptide.

[0068] The use of higher eukaryotic host cells, such as insect or mammalian cells, is expected to provide for such post-translational modifications (e.g. myristoylation, glycosylation, truncation, lipidation and tyrosine, serine or threonine phosphorylation) as may be needed to confer optimal biological activity on recombinant expression products of the invention. Such cell culture systems in which polypeptides of the invention are expressed may be used in assay systems to identify candidate substances which interfere with or enhance the functions of the polypeptides of the invention in the cell.

[0069] B. Polynucleotides

[0070] Polynucleotides of the invention include polynucleotides comprising the nucleic acid sequence set out in SEQ ID No. 1 and fragments thereof. Polynucleotides of the invention also include polynucleotides encoding the polypeptides of the invention. It will be understood by a skilled person that numerous different polynucleotides can encode the same polypeptide as a result of the degeneracy of the genetic code. In addition, it is to be understood that skilled persons may, using routine techniques, make nucleotide substitutions that do not affect the polypeptide sequence encoded by the polynucleotides of the invention to reflect the codon usage of any particular host organism in which the polypeptides of the invention are to be expressed.

[0071] Polynucleotides of the invention may comprise DNA or RNA. They may be single-stranded or double-stranded. They may also be polynucleotides which include within them synthetic or modified nucleotides. A number of different types of modification to oligonucleotides are known in the art. These include methylphosphonate and phosphorothioate backbones, and addition of acridine or polylysine chains at the 3′ and/or 5′ ends of the molecule. For the purposes of the present invention, it is to be understood that the polynucleotides described herein may be modified by any method available in the art. Such modifications may be carried out in order to enhance the in vivo activity or life span of polynucleotides of the invention.

[0072] The terms “variant”, “homologue” or “derivative” in relation to the nucleotide sequence of the present invention include any substitution of, variation of, modification of, replacement of, deletion of or addition of one (or more) nucleic acid from or to the sequence. Preferably said variants, homologues or derivatives code for a polypeptide having biological activity, preferably having substantially the same activity as the amino acid sequence shown as SEQ ID No. 2.

[0073] In a particularly preferred aspect of the invention, the full length human TopBP1 nucleotide sequence described by Yamane et al., 1997, is specifically excluded from the scope of the term “polynucleotides of the invention”. The full length polynucleotide sequence is available under Accession no. AB019397 (gi 3845612). However, it is preferred that homologues within the scope of the invention include nucleotides having less than 99, 98, 95 or 90% homology but more than 30, 40 or 50% homology to the full length sequence set out in Accession no. AB019397.

[0074] As indicated above, with respect to sequence homology, preferably there is at least 50 or 75%, more preferably at least 85%, more preferably at least 90% homology to the sequences shown in the sequence listing herein. More preferably there is at least 95%, more preferably at least 98%, homology. Nucleotide homology comparisons may be conducted as described above. A preferred sequence comparison program is the GCG Wisconsin Bestfit program described above. The default scoring matrix has a match value of 10 for each identical nucleotide and −9 for each mismatch. The default gap creation penalty is −50 and the default gap extension penalty is −3 for each nucleotide.
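To illustrate how the default nucleotide values quoted above combine, the following Python sketch scores an alignment that has already been produced. It does not search for the optimal alignment and is not the Bestfit program itself. The function name and the example sequences are hypothetical, gaps are written as "-", and it is assumed that the extension penalty is charged for every gapped nucleotide in addition to a single creation penalty per gap.

```python
def score_alignment(aligned_a: str, aligned_b: str, match: int = 10, mismatch: int = -9,
                    gap_open: int = -50, gap_extend: int = -3) -> int:
    """Score two pre-aligned nucleotide strings of equal length ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    gap_in_a = gap_in_b = False  # whether the previous column continued a gap
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if a == "-":
            if not gap_in_a:
                score += gap_open  # a new gap starts in the first sequence
            score += gap_extend
            gap_in_a, gap_in_b = True, False
        elif b == "-":
            if not gap_in_b:
                score += gap_open  # a new gap starts in the second sequence
            score += gap_extend
            gap_in_a, gap_in_b = False, True
        else:
            score += match if a == b else mismatch
            gap_in_a = gap_in_b = False
    return score


# Hypothetical alignment: six identical columns and one two-nucleotide gap.
print(score_alignment("ACGTACGT", "ACG--CGT"))
```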

[0075] The present invention also encompasses nucleotide sequences that are capable of hybridising selectively to the sequences presented herein, or any variant, fragment or derivative thereof, or to the complement of any of the above. Nucleotide sequences are preferably at least 15 nucleotides in length, more preferably at least 20, 30, 40 or 50 nucleotides in length.

[0076] The term “hybridization” as used herein shall include “the process by which a strand of nucleic acid joins with a complementary strand through base pairing” as well as the process of amplification as carried out in polymerase chain reaction technologies.

[0077] Polynucleotides of the invention capable of selectively hybridising to the nucleotide sequences presented herein, or to their complement, will be generally at least 70%, preferably at least 80 or 90% and more preferably at least 95% or 98% homologous to the corresponding nucleotide sequences presented herein over a region of at least 20, preferably at least 25 or 30, for instance at least 40, 60 or 100 or more contiguous nucleotides. Preferred polynucleotides of the invention will comprise regions encoding polypeptide domains homologous to the polypeptide domains described above (for example the BRCT domains), preferably at least 70, 80 or 90% and more preferably at least 95% homologous to said regions.

[0078] The term “selectively hybridizable” means that the polynucleotide used as a probe is used under conditions where a target polynucleotide of the invention is found to hybridize to the probe at a level significantly above background. The background hybridization may occur because of other polynucleotides present, for example, in the cDNA or genomic DNA library being screened. In this event, background implies a level of signal generated by interaction between the probe and a non-specific DNA member of the library which is less than 10 fold, preferably less than 100 fold as intense as the specific interaction observed with the target DNA. The intensity of interaction may be measured, for example, by radiolabelling the probe, e.g. with ³²P.
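The signal-to-background criterion above can be sketched as a simple comparison. The function name, the units and the signal values below are hypothetical, and the sketch reads the criterion as requiring the specific signal to exceed the background signal by at least the stated fold.

```python
def is_selective(target_signal: float, background_signal: float, min_fold: float = 10.0) -> bool:
    """True if the target signal exceeds the background by more than min_fold.

    Signals may be in any consistent unit, for example counts per minute
    measured from a 32P-labelled probe.
    """
    if target_signal <= 0 or background_signal < 0:
        raise ValueError("target signal must be positive and background non-negative")
    return background_signal * min_fold < target_signal


# Hypothetical readings: 5000 counts for the target, 200 for a non-specific clone.
print(is_selective(5000, 200))                # evaluated against the 10-fold criterion
print(is_selective(5000, 200, min_fold=100))  # evaluated against the preferred 100-fold criterion
```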

[0079] Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex, as taught in Berger and Kimmel (1987, Guide to Molecular Cloning Techniques, Methods in Enzymology, Vol 152, Academic Press, San Diego Calif.), and confer a defined “stringency” as explained below.

[0080] Maximum stringency typically occurs at about Tm-5° C. (5° C. below the Tm of the probe); high stringency at about 5° C. to 10° C. below Tm; intermediate stringency at about 10° C. to 20° C. below Tm; and low stringency at about 20° C. to 25° C. below Tm. As will be understood by those of skill in the art, a maximum stringency hybridization can be used to identify or detect identical polynucleotide sequences while an intermediate (or low) stringency hybridization can be used to identify or detect similar or related polynucleotide sequences.
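
The stringency ranges above can be expressed as temperature windows relative to Tm. The following is a minimal sketch only: the Tm value is assumed to have been determined separately, and the names `STRINGENCY_OFFSETS_C` and `hybridization_window` are illustrative.

```python
from typing import Tuple

# Offsets below Tm (in degrees C) for each stringency level, taken from the
# ranges stated in the text: (smallest offset, largest offset).
STRINGENCY_OFFSETS_C = {
    "maximum": (5.0, 5.0),
    "high": (5.0, 10.0),
    "intermediate": (10.0, 20.0),
    "low": (20.0, 25.0),
}


def hybridization_window(tm_c: float, stringency: str) -> Tuple[float, float]:
    """Return the (lowest, highest) hybridization temperature in degrees C
    for a probe with melting temperature `tm_c` at the given stringency."""
    smallest, largest = STRINGENCY_OFFSETS_C[stringency]
    return (tm_c - largest, tm_c - smallest)


# Example with an assumed Tm of 70 degrees C.
high_window = hybridization_window(70.0, "high")
low_window = hybridization_window(70.0, "low")
```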

[0081] In a preferred aspect, the present invention covers nucleotide sequences that can hybridise to the nucleotide sequence of the present invention under stringent conditions (e.g. 65° C. and 0.1×SSC {1×SSC = 0.15 M NaCl, 0.015 M Na₃ citrate, pH 7.0}).

[0082] Where the polynucleotide of the invention is double-stranded, both strands of the duplex, either individually or in combination, are encompassed by the present invention. Where the polynucleotide is single-stranded, it is to be understood that the complementary sequence of that polynucleotide is also included within the scope of the present invention.

[0083] Polynucleotides which are not 100% homologous to the sequences of the present invention but fall within the scope of the invention can be obtained in a number of ways. Other variants of the sequences described herein may be obtained for example by probing DNA libraries made from a range of individuals, for example individuals from different populations. In addition, other viral, bacterial or cellular homologues, particularly cellular homologues found in mammalian cells (e.g. rat, mouse, bovine and primate cells), may be obtained and such homologues and fragments thereof in general will be capable of selectively hybridising to the sequences shown in the sequence listing herein. Such sequences may be obtained by making cDNA or genomic DNA libraries from other animal species and probing such libraries with probes comprising all or part of SEQ I.D. No. 1 under conditions of medium to high stringency. More preferably, the nucleotide sequences of the human TopBP1 protein described by Yamane et al., 1997 (Accession no. AB01939i), or fragments thereof, may be used to identify other primate/mammalian homologues since nucleotide homology between human sequences and mammalian sequences is likely to be higher than is the case for the Drosophila sequence identified herein.

[0084] Similar considerations apply to obtaining species homologues and allelic variants of the polypeptide or nucleotide sequences of the invention.

[0085] Variants and strain/species homologues may also be obtained using degenerate PCR which will use primers designed to target sequences within the variants and homologues encoding conserved amino acid sequences within the sequences of the present invention. Conserved sequences can be predicted, for example, by aligning the amino acid sequences from several variants/homologues. Sequence alignments can be performed using computer software known in the art. For example the GCG Wisconsin PileUp program is widely used.

[0086] The primers used in degenerate PCR will contain one or more degenerate positions and will be used at stringency conditions lower than those used for cloning sequences with single sequence primers against known sequences. It will be appreciated by the skilled person that overall nucleotide homology between sequences from distantly related organisms is likely to be very low and thus in these situations degenerate PCR may be the method of choice rather than screening libraries with labelled fragments of SEQ I.D. No. 1.
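
As an illustration of what a degenerate position implies, the sketch below counts and enumerates the distinct oligonucleotides represented by a primer written with the standard IUPAC ambiguity codes. The primer shown is a made-up example, not one derived from SEQ I.D. No. 1, and the function names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer: str) -> list:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer with two degenerate positions (Y and N).
example_primer = "ATGGAYCCN"
pool_size = degeneracy(example_primer)
```

A higher degeneracy means each individual sequence is present at a lower effective concentration in the primer pool, which is one reason degenerate PCR is run at reduced stringency.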

[0087] Alternatively, such polynucleotides may be obtained by site directed mutagenesis of characterised sequences, such as SEQ I.D. No. 1. This may be useful where for example silent codon changes are required to sequences to optimise codon preferences for a particular host cell in which the polynucleotide sequences are being expressed. Other sequence changes may be desired in order to introduce restriction enzyme recognition sites, or to alter the property or function of the polypeptides encoded by the polynucleotides. For example, further changes may be desirable to represent particular coding changes found in mus101 which give rise to mutant mus101 genes which have lost their regulatory function. Probes based on such changes can be used as diagnostic probes to detect such mus101 mutants.
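
Whether a proposed codon change is silent can be checked by translating the sequence before and after the change. The sketch below uses the standard genetic code; the short coding sequences are invented for illustration and are not fragments of mus101, and `translate` and `is_silent` are hypothetical helper names.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_silent(original: str, modified: str) -> bool:
    """True if the modified sequence encodes the same polypeptide as the original."""
    return len(original) == len(modified) and translate(original) == translate(modified)


# Invented example: CTT -> CTG (Leu) and GAA -> GAG (Glu) are both silent.
silent = is_silent("ATGCTTGAA", "ATGCTGGAG")
# CTT -> CCT changes Leu to Pro, so this change is not silent.
not_silent = is_silent("ATGCTTGAA", "ATGCCTGAA")
```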

[0088] Polynucleotides of the invention may be used to produce a primer, e.g. a PCR primer, a primer for an alternative amplification reaction, a probe e.g. labelled with a revealing label by conventional means using radioactive or non-radioactive labels, or the polynucleotides may be cloned into vectors. Such primers, probes and other fragments will be at least 15, preferably at least 20, for example at least 25, 30 or 40 nucleotides in length, and are also encompassed by the term polynucleotides of the invention as used herein.

[0089] Polynucleotides such as DNA polynucleotides and probes according to the invention may be produced recombinantly, synthetically, or by any means available to those of skill in the art. They may also be cloned by standard techniques.

[0090] In general, primers will be produced by synthetic means, involving a step wise manufacture of the desired nucleic acid sequence one nucleotide at a time. Automated techniques for accomplishing this are readily available in the art.

[0091] Longer polynucleotides will generally be produced using recombinant means, for example using PCR (polymerase chain reaction) cloning techniques. This will involve making a pair of primers (e.g. of about 15 to 30 nucleotides) flanking the region of the sequence which it is desired to clone, bringing the primers into contact with mRNA or cDNA obtained from an animal or human cell, performing a polymerase chain reaction under conditions which bring about amplification of the desired region, isolating the amplified fragment (e.g. by purifying the reaction mixture on an agarose gel) and recovering the amplified DNA. The primers may be designed to contain suitable restriction enzyme recognition sites, so that the amplified DNA can be cloned into a suitable cloning vector.

[0092] Polynucleotides or primers of the invention may carry a revealing label. Suitable labels include radioisotopes such as ³²P or ³⁵S, enzyme labels, or other protein labels such as biotin. Such labels may be added to polynucleotides or primers of the invention and may be detected using techniques known per se.

[0093] Polynucleotides or primers of the invention or fragments thereof labelled or unlabelled may be used by a person skilled in the art in nucleic acid-based tests for detecting or sequencing mus101 and its homologues in the human or animal body.

[0094] Such tests for detecting generally comprise bringing a biological sample containing DNA or RNA into contact with a probe comprising a polynucleotide or primer of the invention under hybridising conditions and detecting any duplex formed between the probe and nucleic acid in the sample. Such detection may be achieved using techniques such as PCR or by immobilising the probe on a solid support, removing nucleic acid in the sample which is not hybridised to the probe, and then detecting nucleic acid which has hybridised to the probe. Alternatively, the sample nucleic acid may be immobilised on a solid support, and the amount of probe bound to such a support can be detected. Suitable assay methods of this and other formats can be found in for example WO89/03891 and WO90/13667.

[0095] Tests for sequencing mus101 and its homologues include bringing a biological sample containing target DNA or RNA into contact with a probe comprising a polynucleotide or primer of the invention under hybridising conditions and determining the sequence by, for example the Sanger dideoxy chain termination method (see Sambrook et al.).

[0096] Such a method generally comprises elongating, in the presence of suitable reagents, the primer by synthesis of a strand complementary to the target DNA or RNA and selectively terminating the elongation reaction at one or more of an A, C, G or T/U residue; allowing strand elongation and termination reaction to occur; and separating out according to size the elongated products to determine the sequence of the nucleotides at which selective termination has occurred. Suitable reagents include a DNA polymerase enzyme, the deoxynucleotides dATP, dCTP, dGTP and dTTP, a buffer and ATP. Dideoxynucleotides are used for selective termination.

[0097] Tests for detecting or sequencing mus101, or its homologue, in a biological sample may be used to determine mus101 sequences within cells in individuals who have, or are suspected to have, an altered mus101 gene sequence, for example within cancer cells including leukaemia cells and solid tumours such as breast, ovary, lung, colon, pancreas, testes, liver, brain, muscle and bone tumours.

[0098] In addition, the discovery of mus101 will allow the role of this gene in hereditary diseases to be investigated. In general, this will involve establishing the status of mus101, or its homologue (e.g. using PCR sequence analysis), in cells derived from animals or humans with, for example, tumours.

[0099] The probes of the invention may conveniently be packaged in the form of a test kit in a suitable container. In such kits the probe may be bound to a solid support where the assay format for which the kit is designed requires such binding. The kit may also contain suitable reagents for treating the sample to be probed, hybridising the probe to nucleic acid in the sample, control reagents, instructions, and the like.

[0100] C. Nucleic Acid Vectors

[0101] Polynucleotides of the invention can be incorporated into a recombinant replicable vector. The vector may be used to replicate the nucleic acid in a compatible host cell. Thus in a further embodiment, the invention provides a method of making polynucleotides of the invention by introducing a polynucleotide of the invention into a replicable vector, introducing the vector into a compatible host cell, and growing the host cell under conditions which bring about replication of the vector. The vector may be recovered from the host cell. Suitable host cells include bacteria such as E. coli, yeast, mammalian cell lines and other eukaryotic cell lines, for example insect Sf9 cells.

[0102] Preferably, a polynucleotide of the invention in a vector is operably linked to a control sequence that is capable of providing for the expression of the coding sequence by the host cell, i.e. the vector is an expression vector. The term “operably linked” means that the components described are in a relationship permitting them to function in their intended manner. A regulatory sequence “operably linked” to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.

[0103] The control sequences may be modified, for example by the addition of further transcriptional regulatory elements to make the level of transcription directed by the control sequences more responsive to transcriptional modulators.

[0104] Vectors of the invention may be transformed or transfected into a suitable host cell as described below to provide for expression of a protein of the invention. This process may comprise culturing a host cell transformed with an expression vector as described above under conditions to provide for expression by the vector of a coding sequence encoding the protein, and optionally recovering the expressed protein. Vectors will be chosen that are compatible with the host cell used.

[0105] The vectors may be for example, plasmid or virus vectors provided with an origin of replication, optionally a promoter for the expression of the said polynucleotide and optionally a regulator of the promoter. The vectors may contain one or more selectable marker genes, for example an ampicillin resistance gene in the case of a bacterial plasmid or a neomycin resistance gene for a mammalian vector. Vectors may be used, for example, to transfect or transform a host cell.

[0106] Control sequences operably linked to sequences encoding the polypeptide of the invention include promoters/enhancers and other expression regulation signals. These control sequences may be selected to be compatible with the host cell for which the expression vector is designed to be used in. The term promoter is well-known in the art and encompasses nucleic acid regions ranging in size and complexity from minimal promoters to promoters including upstream elements and enhancers.

[0107] The promoter is typically selected from promoters which are functional in mammalian cells, although prokaryotic promoters and promoters functional in other eukaryotic cells, such as insect cells, may be used. The promoter is typically derived from promoter sequences of viral or eukaryotic genes. For example, it may be a promoter derived from the genome of a cell in which expression is to occur. With respect to eukaryotic promoters, they may be promoters that function in a ubiquitous manner (such as promoters of α-actin, β-actin, tubulin) or, alternatively, a tissue-specific manner (such as promoters of the genes for pyruvate kinase). They may also be promoters that respond to specific stimuli, for example promoters that bind steroid hormone receptors. Viral promoters may also be used, for example the Moloney murine leukaemia virus long terminal repeat (MMLV LTR) promoter, the Rous sarcoma virus (RSV) LTR promoter or the human cytomegalovirus (CMV) IE promoter.

[0108] It may also be advantageous for the promoters to be inducible so that the levels of expression of the heterologous gene can be regulated during the life-time of the cell. Inducible means that the levels of expression obtained using the promoter can be regulated.

[0109] In addition, any of these promoters may be modified by the addition of further regulatory sequences, for example enhancer sequences. Chimeric promoters may also be used comprising sequence elements from two or more different promoters described above.

[0110] Polynucleotides according to the invention may also be inserted into the vectors described above in an antisense orientation to provide for the production of antisense RNA. Antisense RNA or other antisense polynucleotides may also be produced by synthetic means. Such antisense polynucleotides may be used in a method of controlling the levels of mus101 or its variants or species homologues.

[0111] D. Host Cells

[0112] Vectors and polynucleotides of the invention may be introduced into host cells for the purpose of replicating the vectors/polynucleotides and/or expressing the polypeptides of the invention encoded by the polynucleotides of the invention. Although the polypeptides of the invention may be produced using prokaryotic cells as host cells, it is preferred to use eukaryotic cells, for example yeast, insect or mammalian cells, in particular mammalian cells.

[0113] Vectors/polynucleotides of the invention may be introduced into suitable host cells using a variety of techniques known in the art, such as transfection, transformation and electroporation. Where vectors/polynucleotides of the invention are to be administered to animals, several techniques are known in the art, for example infection with recombinant viral vectors such as retroviruses, herpes simplex viruses and adenoviruses, direct injection of nucleic acids and biolistic transformation.

[0114] E. Protein Expression and Purification

[0115] Host cells comprising polynucleotides of the invention may be used to express polypeptides of the invention. Host cells may be cultured under suitable conditions which allow expression of the proteins of the invention. Expression of the polypeptides of the invention may be constitutive such that they are continually produced, or inducible, requiring a stimulus to initiate expression. In the case of inducible expression, protein production can be initiated when required by, for example, addition of an inducer substance to the culture medium, for example dexamethasone or IPTG.

[0116] Polypeptides of the invention can be extracted from host cells by a variety of techniques known in the art, including enzymatic, chemical and/or osmotic lysis and physical disruption.

[0117] Polypeptides of the invention may also be produced recombinantly in an in vitro cell-free system, such as the TnT™ (Promega) rabbit reticulocyte system.

[0118] F. Antibodies

[0119] The invention also provides monoclonal or polyclonal antibodies to polypeptides of the invention or fragments thereof. Thus, the present invention further provides a process for the production of monoclonal or polyclonal antibodies to polypeptides of the invention.

[0120] If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit, goat, horse, etc.) is immunised with an immunogenic polypeptide bearing a mus101 epitope(s). Serum from the immunised animal is collected and treated according to known procedures. If serum containing polyclonal antibodies to a mus101 epitope contains antibodies to other antigens, the polyclonal antibodies can be purified by immunoaffinity chromatography. Techniques for producing and processing polyclonal antisera are known in the art. In order that such antibodies may be made, the invention also provides polypeptides of the invention or fragments thereof haptenised to another polypeptide for use as immunogens in animals or humans.

[0121] Monoclonal antibodies directed against mus101 epitopes in the polypeptides of the invention can also be readily produced by one skilled in the art. The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal antibody-producing cell lines can be created by cell fusion, and also by other techniques such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. Panels of monoclonal antibodies produced against mus101 epitopes can be screened for various properties; i.e., for isotype and epitope affinity.

[0122] An alternative technique involves screening phage display libraries where, for example the phage express scFv fragments on the surface of their coat with a large variety of complementarity determining regions (CDRs). This technique is well known in the art.

[0123] Antibodies, both monoclonal and polyclonal, which are directed against mus101 epitopes are particularly useful in diagnosis, and those which are neutralising are useful in passive immunotherapy. Monoclonal antibodies, in particular, may be used to raise anti-idiotype antibodies. Anti-idiotype antibodies are immunoglobulins which carry an “internal image” of the antigen of the agent against which protection is desired.

[0124] Techniques for raising anti-idiotype antibodies are known in the art. These anti-idiotype antibodies may also be useful in therapy.

[0125] For the purposes of this invention, the term “antibody”, unless specified to the contrary, includes fragments of whole antibodies which retain their binding activity for a target antigen. Such fragments include Fv, F(ab′) and F(ab′)₂ fragments, as well as single chain antibodies (scFv). Furthermore, the antibodies and fragments thereof may be humanised antibodies, for example as described in EP-A-239400.

[0126] Antibodies may be used to detect polypeptides of the invention present in biological samples by a method which comprises:

[0127] (a) providing an antibody of the invention;

[0128] (b) incubating a biological sample with said antibody under conditions which allow for the formation of an antibody-antigen complex; and

[0129] (c) determining whether antibody-antigen complex comprising said antibody is formed.

[0130] Suitable samples include extracts from tissues such as brain, breast, ovary, lung, colon, pancreas, testes, liver, muscle and bone tissues or from neoplastic growths derived from such tissues.

[0131] Antibodies of the invention may be bound to a solid support and/or packaged into kits in a suitable container along with suitable reagents, controls, instructions and the like.

[0132] G. Assays

[0133] The present invention provides assays that are suitable for identifying substances that bind to mus101 polypeptides (reference to which includes homologues, variants, derivatives and fragments as described above). In addition, assays are provided that are suitable for identifying substances that interfere with mus101 binding to components of the cellular machinery such as topoisomerase II. Such assays are typically in vitro. Assays are also provided that test the effects of candidate substances identified in preliminary in vitro assays on intact cells in whole cell assays.

[0134] Candidate Substances

[0135] A substance that inhibits cell division (including mitosis and/or meiosis) as a result of an interaction with mus101 polypeptides may do so in several ways. It may directly disrupt the binding of mus101 to a component of the spindle apparatus by, for example, binding to mus101 and masking or altering the site of interaction with the other component. Candidate substances of this type may conveniently be preliminarily screened by in vitro binding assays as, for example, described below and then tested, for example in a whole cell assay as described below. Examples of candidate substances include antibodies which recognise mus101 and peptides derived from topoisomerase II polypeptides.

[0136] A substance which can bind directly to mus101 may also inhibit its function in cell division by altering its subcellular localisation, thus preventing mus101 and components of the mitotic apparatus from coming into contact within the cell. This can be tested using, for example, the whole cell assays described below. Non-functional homologues of mus101 may also be tested for inhibition of mitosis since they may compete with mus101 for binding to components of the mitotic apparatus whilst being incapable of the normal functions of mus101, or block the function of mus101 bound to the mitotic apparatus. Such non-functional homologues may include naturally occurring mus101 mutants and modified mus101 sequences or fragments thereof. In particular, fragments of mus101 which comprise one or more BRCT domains but lack other functional domains may be used to compete with full length mus101 for binding to cellular components.

[0137] Alternatively, instead of preventing the association of the components directly, the substance may suppress the biologically available amount of mus101. This may be by inhibiting expression of the component, for example at the level of transcription, transcript stability, translation or post-translational stability. An example of such a substance would be antisense RNA or double-stranded interfering RNA sequences which suppress the amount of mus101 mRNA biosynthesis.

[0138] Suitable candidate substances include peptides, especially of from about 5 to 30 or 10 to 25 amino acids in size, based on the sequence of the various domains of Drosophila mus101 described in section A, or variants of such peptides in which one or more residues have been substituted. Peptides from panels of peptides comprising random sequences or sequences which have been varied consistently to provide a maximally diverse panel of peptides may be used.

[0139] Suitable candidate substances also include antibody products (for example, monoclonal and polyclonal antibodies, single chain antibodies, chimeric antibodies and CDR-grafted antibodies) which are specific for mus101. Furthermore, combinatorial libraries, peptide and peptide mimetics, defined chemical entities, oligonucleotides, and natural product libraries may be screened for activity as inhibitors of binding of mus101 to cellular components such as topoisomerase II. The candidate substances may be used in an initial screen in batches of, for example 10 substances per reaction, and the substances of those batches which show inhibition tested individually. Candidate substances which show activity in in vitro screens such as those described below can then be tested in whole cell systems, such as mammalian cells which will be exposed to the inhibitor and tested for inhibition of mitosis or increased sensitivity to DNA damaging agents.
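
The batched screening strategy described above (pools of, for example, 10 substances, followed by individual testing of members of positive pools) can be sketched as follows. The assay readout is represented by a caller-supplied function `shows_inhibition`, which is a hypothetical stand-in for a real binding or inhibition measurement; the compound names are placeholders.

```python
def screen_in_batches(substances, shows_inhibition, batch_size=10):
    """Screen substances in pools, then retest members of positive pools singly.

    `shows_inhibition` takes a list of substances tested together in one
    reaction and returns True if inhibition is observed (hypothetical readout).
    """
    hits = []
    for start in range(0, len(substances), batch_size):
        batch = substances[start:start + batch_size]
        if shows_inhibition(batch):
            # Deconvolute the positive pool by testing each member alone.
            hits.extend(s for s in batch if shows_inhibition([s]))
    return hits


# Simulated example: 25 placeholder compounds, two of which are active.
library = [f"cmpd{i}" for i in range(25)]
active = {"cmpd3", "cmpd17"}


def simulated_assay(pool):
    return any(s in active for s in pool)


found = screen_in_batches(library, simulated_assay, batch_size=10)
```

In this simulation 3 pooled reactions plus 20 individual retests identify the two active compounds, compared with 25 reactions if every substance were tested singly. The sketch assumes that pooling does not mask or mimic inhibition, which would need to be confirmed for a real assay.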

[0140] Mus101 Binding Assays

[0141] One type of assay for identifying substances that bind to mus101 involves contacting a mus101 polypeptide, which is immobilised on a solid support, with a non-immobilised candidate substance and determining whether and/or to what extent the mus101 polypeptide and candidate substance bind to each other. Alternatively, the candidate substance may be immobilised and the mus101 polypeptide non-immobilised.

[0142] In a preferred assay method, the mus101 polypeptide is immobilised on beads such as agarose beads. Typically this is achieved by expressing the component as a GST-fusion protein in bacteria, yeast or higher eukaryotic cell lines and purifying the GST-fusion protein from crude cell extracts using glutathione-agarose beads (Smith and Johnson, 1988). As a control, binding of the candidate substance, which is not a GST-fusion protein, to the beads is determined in the absence of the mus101 polypeptide. The binding of the candidate substance to the immobilised mus101 polypeptide is then determined. This type of assay is known in the art as a GST pulldown assay. Again, the candidate substance may be immobilised and the mus101 polypeptide non-immobilised.

[0143] It is also possible to perform this type of assay using different affinity purification systems for immobilising one of the components, for example Ni-NTA agarose and histidine-tagged components.

[0144] Binding of the mus101 polypeptide to the candidate substance may be determined by a variety of methods well-known in the art. For example, the non-immobilised component may be labelled (with for example, a radioactive label, an epitope tag or an enzyme-antibody conjugate). Alternatively, binding may be determined by immunological detection techniques. For example, the reaction mixture can be Western blotted and the blot probed with an antibody that detects the non-immobilised component. ELISA techniques may also be used.

[0145] Candidate substances are typically added to a final concentration of from 1 to 1000 nmol/ml, more preferably from 1 to 100 nmol/ml. In the case of antibodies, the final concentration used is typically from 100 to 500 μg/ml, more preferably from 200 to 300 μg/ml.

[0146] Interacting proteins including components of multimeric protein complexes involving mus101 may also be identified by, for example, a two-hybrid screen. The two-hybrid system was developed in yeast (Chien et al., 1991, Proc. Natl. Acad. Sci. USA 88, 9578-9582) and is based on functional in vivo reconstitution of a transcription factor which activates a reporter gene. Specifically, a polynucleotide encoding a protein that interacts with mus101 is isolated by: transforming or transfecting appropriate host cells with a DNA construct comprising a reporter gene under the control of a promoter regulated by a transcription factor having a DNA binding domain and an activating domain; expressing in the host cells a first hybrid DNA sequence encoding a first fusion of part or all of mus101 and either the DNA binding domain or the activating domain of the transcription factor; expressing in the host cells a library of second hybrid DNA sequences encoding second fusions of part or all of putative mus101 binding proteins and the DNA binding domain or activating domain of the transcription factor which is not incorporated in the first fusion; detecting binding of a mus101 interacting protein to mus101 in a particular host cell by detecting the production of reporter gene product in the host cell; and isolating second hybrid DNA sequences encoding the interacting protein from the particular host cell. Presently preferred for use in the assay are a lexA promoter to drive expression of the reporter gene, the lacZ reporter gene, a transcription factor comprising the lexA DNA binding domain and the GAL4 transactivation domain, and yeast host cells.

[0147] Assays for identifying compounds that modulate interaction of mus101 with other proteins may involve: transforming or transfecting appropriate host cells with a DNA construct comprising a reporter gene under the control of a promoter regulated by a transcription factor having a DNA binding domain and an activating domain; expressing in the host cells a first hybrid DNA sequence encoding a first fusion of part or all of mus101 and the DNA binding domain or the activating domain of the transcription factor; expressing in the host cells a second hybrid DNA sequence encoding part or all of a protein that interacts with mus101 and the DNA binding domain or activating domain of the transcription factor which is not incorporated in the first fusion; evaluating the effect of a test compound on the interaction between mus101 and the interacting protein by detecting binding of the interacting protein to mus101 in a particular host cell by measuring the production of reporter gene product in the host cell in the presence or absence of the test compound; and identifying modulating compounds as those test compounds altering production of the reporter gene product in comparison to production of the reporter gene product in the absence of the modulating compound. Presently preferred for use in the assay are a lexA promoter to drive expression of the reporter gene, the lacZ reporter gene, a transcription factor comprising the lexA DNA binding domain and the GAL4 transactivation domain, and yeast host cells.

[0148] Another type of assay for identifying compounds that modulate the interaction between mus101 and an interacting protein involves immobilising mus101 or a natural mus101 interacting protein, detectably labelling the non-immobilised binding partner, incubating the binding partners together and determining the effect of a test compound on the amount of label bound, wherein a reduction in the label bound in the presence of the test compound compared to the amount of label bound in the absence of the test compound indicates that the test compound is an inhibitor of mus101 interaction with the protein. Conversely, an increase in the label bound in the presence of the test compound compared to the amount of label bound in the absence of the test compound indicates that the putative modulator is an activator of mus101 interaction with the protein.
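
The comparison described above, bound label with versus without the test compound, can be sketched as a simple classification. The `tolerance` parameter is an assumption added for illustration (a minimum fractional change treated as meaningful); the text itself does not specify one, and the count values are invented.

```python
def classify_modulator(bound_with_compound: float,
                       bound_without_compound: float,
                       tolerance: float = 0.0) -> str:
    """Classify a test compound from the amount of label bound.

    Returns "inhibitor" if less label is bound in the presence of the
    compound, "activator" if more is bound, and "no effect" otherwise.
    `tolerance` is a hypothetical minimum fractional change.
    """
    if bound_without_compound <= 0:
        raise ValueError("control binding must be positive")
    change = (bound_with_compound - bound_without_compound) / bound_without_compound
    if change < -tolerance:
        return "inhibitor"
    if change > tolerance:
        return "activator"
    return "no effect"


# Invented readings, e.g. counts per minute of bound radiolabel.
result = classify_modulator(bound_with_compound=400, bound_without_compound=1000)
```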

[0149] Whole Cell Assays

[0150] Candidate substances may also be tested on whole cells for their effect on cell division, including mitosis and/or meiosis. Alternatively, or in addition, candidate substances may be tested for their ability to increase the susceptibility of a cell to DNA damaging agents. Preferably the candidate substances have been identified by the above-described in vitro methods. Alternatively, rapid throughput screens for substances capable of inhibiting cell division, typically mitosis, and/or increasing susceptibility to DNA damaging agents may be used as a preliminary screen and then used in the in vitro assay described above to confirm that the effect is on mus101.

[0151] The candidate substance, i.e. the test compound, may be administered to the cell in several ways. For example, it may be added directly to the cell culture medium or injected into the cell. Alternatively, in the case of polypeptide candidate substances, the cell may be transfected with a nucleic acid construct which directs expression of the polypeptide in the cell. Preferably, the expression of the polypeptide is under the control of a regulatable promoter.

[0152] Typically, an assay to determine the effect of a candidate substance, such as a candidate substance identified by the method of the invention, on cell mitosis comprises administering the candidate substance to a cell and determining whether the substance inhibits mitosis. Techniques for measuring mitosis in a cell population are well known in the art. The extent of mitosis in treated cells is compared with the extent of mitosis in an untreated control cell population to determine the degree of inhibition, if any.

[0153] Typically, an assay to determine the effect of a candidate substance, such as a candidate substance identified by the method of the invention, on the susceptibility of cells to DNA damaging agents comprises administering the candidate substance to a cell, treating the cell with a DNA damaging agent and determining whether the substance increases the susceptibility of the cell to DNA damage. Techniques for treating cells and measuring susceptibility to DNA damage (such as treating cells with X-rays and measuring cell survival) are well known in the art. Typically, the extent of cell death in a treated cell population is compared with the extent of cell death in an untreated control cell population to determine the increase in susceptibility, if any. A suitable protocol is described in Errami et al., 1998, Nucl. Acids Res. 26: 4332-4338. DNA damaging agents include mitomycin C, alkylating agents such as methyl methanesulphonate (MMS) and ethyl methanesulphonate (EMS), bleomycin, X-rays (typically up to 10 Gy), gamma rays and UV light (typically up to 20 J/m²).

[0154] The concentration of candidate substances used will typically be such that the final concentration in the cells is similar to that described above for the in vitro assays.

[0155] A candidate substance is typically considered to be an inhibitor of mitosis if mitosis is reduced to below 50%, preferably below 40, 30, 20 or 10% of that observed in untreated control cell populations. Increased sensitivity to DNA damaging agents typically results in a 10-fold, preferably a 100-fold reduction in survival of treated cells relative to untreated cells.
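
The thresholds given above can be expressed as a short Python sketch. This is an illustration only: the function names and the way the measurements are represented (a mitotic measure and a survival measure for treated and control populations) are assumptions made here, while the 50% limit and the 10-fold and 100-fold levels are those stated in the text.

```python
def is_mitosis_inhibitor(treated_mitosis, control_mitosis, threshold=0.50):
    """True if mitosis in treated cells is below `threshold` of the control."""
    if control_mitosis <= 0:
        raise ValueError("control mitosis must be positive")
    return treated_mitosis / control_mitosis < threshold


def survival_fold_reduction(treated_survival, control_survival):
    """Fold reduction in survival of treated cells relative to untreated cells."""
    if treated_survival <= 0:
        raise ValueError("treated survival must be positive")
    return control_survival / treated_survival


def sensitisation_level(fold):
    """Map a fold reduction in survival onto the levels named in the text."""
    if fold >= 100:
        return "preferred (100-fold or more)"
    if fold >= 10:
        return "typical (10-fold or more)"
    return "below threshold"


# Hypothetical colony counts: 500 survivors without the substance, 5 with it.
fold = survival_fold_reduction(5, 500)
print(fold, sensitisation_level(fold))
```

The stricter preferred limits for mitosis (40, 30, 20 or 10%) can be applied by passing a lower `threshold`.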

[0156] H. Therapeutic Uses

[0157] Many tumours are associated with rapid and often aberrant mitosis. One therapeutic approach to treating cancer is to inhibit mitosis in rapidly dividing cells. Thus, since mus101 appears to be required for the normal mitotic process, it represents a target for inhibition of cell division, particularly in tumour cells. In addition, since mus101 appears to be involved in post-replicative repair and mus101 mutants show increased susceptibility to DNA damaging agents, disruption of mus101 function may be used in conjunction with chemotherapy and/or radiotherapy to increase the susceptibility of tumour cells to the chemotherapy and/or radiotherapy.

[0158] One possible approach is to express anti-sense mus101 constructs, preferably selectively in tumour cells, to inhibit mus101 function. Another approach is to use non-functional variants of mus101 that compete with mus101 for cellular components of mitosis, resulting in inhibition of mitosis. Alternatively, compounds identified by the assays described above as binding to mus101 may be administered to tumour cells to prevent mus101 function. This may be performed, for example, by means of gene therapy or by direct administration of the compounds. Anti-mus101 antibodies may also be used as therapeutic agents.

[0159] I. Administration

[0160] Substances identified or identifiable by the assay methods of the invention may preferably be combined with various components to produce compositions of the invention. Preferably the compositions are combined with a pharmaceutically acceptable carrier or diluent to produce a pharmaceutical composition (which may be for human or animal use). Suitable carriers and diluents include isotonic saline solutions, for example phosphate-buffered saline. The composition of the invention may be administered by direct injection. The composition may be formulated for parenteral, intramuscular, intravenous, subcutaneous, intraocular or transdermal administration. Typically, each protein may be administered at a dose of from 0.01 to 30 mg/kg body weight, preferably from 0.1 to 10 mg/kg, more preferably from 0.1 to 1 mg/kg body weight.

[0161] Polynucleotides/vectors encoding polypeptide components (or antisense constructs) for use in inhibiting mitosis may be administered directly as a naked nucleic acid construct. They may further comprise flanking sequences homologous to the host cell genome. When the polynucleotides/vectors are administered as a naked nucleic acid, the amount of nucleic acid administered may typically be in the range of from 1 μg to 10 mg, preferably from 100 μg to 1 mg. It is particularly preferred to use polynucleotides/vectors that target specifically tumour cells, for example by virtue of suitable regulatory constructs or by the use of targeted viral vectors.

[0162] Uptake of naked nucleic acid constructs by mammalian cells is enhanced by several known transfection techniques, for example those including the use of transfection agents. Examples of these agents include cationic agents (for example calcium phosphate and DEAE-dextran) and lipofectants (for example lipofectam™ and transfectam™). Typically, nucleic acid constructs are mixed with the transfection agent to produce a composition.

[0163] Preferably the polynucleotide or vector according to the invention is combined with a pharmaceutically acceptable carrier or diluent to produce a pharmaceutical composition. Suitable carriers and diluents include isotonic saline solutions, for example phosphate-buffered saline. The composition may be formulated for parenteral, intramuscular, intravenous, subcutaneous, intraocular or transdermal administration.

[0164] The routes of administration and dosages described are intended only as a guide since a skilled practitioner will be able to determine readily the optimum route of administration and dosage for any particular patient and condition.

[0165] The invention will now be further described by way of Examples, which are meant to assist one of ordinary skill in the art in carrying out the invention and are not intended in any way to limit the scope of the invention. The Examples refer to the Figures. In the Figures:

DETAILED DESCRIPTION OF THE FIGURES

[0166]FIG. 1—Molecular map of the 12B region.

[0167] The region 12B is represented by 150 kb of cloned genomic DNA. The cosmids and phages used to cover the walk in the region 12B are represented in the lower part of the map. Restriction sites for endonucleases EcoRI and BamHI are represented in the upper and lower parts of the map. Important landmarks in the walk are represented. See text for details.

[0168]FIG. 2—Molecular analysis of X-ray induced, DEB-induced and P-element imprecise excision mutants recovered in this work.

[0169] The deletion of the chromosome walk was monitored by the absence of a 6 kb BamHI fragment revealed after hybridization with probe B4.7 in the various mutants.

[0170] (A) DNA from various female mutants balanced over FM7 or Basc and from homozygous controls was cut with BamHI, run in a 1% agarose gel and probed with the fragment B4.7 after alkaline transfer (see Materials and Methods). DNA was analysed from females of the following genotypes: 1) xr16/FM7; 2) deb54/Basc; 3) deb154/Basc; 4) deb118/deb118; 5) deb142/deb142; 6) p205C/FM7; 8) p116D/FM7; 9) p205A/FM7; 11) P[w⁺, ry⁺]E2/P[w⁺, ry⁺]E2; 13) OregonR/OregonR; 14) FM7/FM7 and 15) Basc/Basc.

[0171] (B) Diagrammatic representation of the chromosome walk in the 12B region. The EcoRI sites are represented in the upper part, and the BamHI sites in the lower. Important features are the localisation of the P-element insertion P[w⁺, ry⁺]E2, the breakpoint of the deficiency Df(1)LCD and the garnet locus. The 4.7 kb BamHI fragment (B4.7) used as a probe is shown in the lower part of the walk.

[0172]FIG. 3—Molecular characterisation of the 12B region. Reduction of the chromosome walk

[0173] (A) DNA from various female mutants balanced over FM7 and from homozygous controls was cut with EcoRV, run in a 1% agarose gel and probed with the fragment EV8 after alkaline transfer (see Materials and Methods). DNA was analysed from females of the following genotypes: 1) deb154/FM7; 2) p116D/FM7; 3) p205A/FM7; 4) p281A/FM7; 5) p490D/FM7; 6) P[w⁺, ry⁺]E2/P[w⁺, ry⁺]E2; 7) OregonR/OregonR and 8) FM7/FM7.

[0174] (B) DNA from female mutants p281A and p490D balanced over FM7 and from homozygous controls as presented in (A) was cut with the enzyme BamHI, run in a 1% agarose gel and probed with the fragment B15 after alkaline transfer (see Materials and Methods). DNA was analysed from females of the following genotypes: 1) p281A/FM7; 2) p490D/FM7; 3) P[w⁺, ry⁺]E2/P[w⁺, ry⁺]E2; 4) OregonR/OregonR and 5) FM7/FM7.

[0175] (C) Diagrammatic representation of the chromosome walk in the 12B region. The EcoRI sites are represented in the upper part, and the BamHI sites in the lower. Important features are the localisation of the P-element insertion P[w⁺, ry⁺]E2, the breakpoint of the deficiency Df(1)LCD and the garnet locus. The 8 kb EcoRV fragment (EV8) and the 15 kb BamHI fragment (B15) used as probes are shown in the lower part of the walk. The localisation of the proximal breakpoint of Df(1)p490D is shown by a vertical arrow in the upper part of the walk. The interval in which the proximal breakpoints of Df(1)p116D, Df(1)p205A and Df(1)deb154 lie is represented by a horizontal line in the upper part of the map.

[0176]FIG. 4—Diagrammatic representation of the limited region of the walk in 12B to illustrate the position of transcription units.

[0177] (A) Northern-blot analysis of 0-4 hour embryos. Three genomic probes were used to identify transcription units in the reduced walk: lane 1) E2.9; lane 2) B0.9; and lane 3) B10. Please refer to the map in (B) for localisation of the restriction fragments.

[0178] (B) The map of the limited walk of region 12B is represented by the central line. The relevant fragments used to search for RFLPs and transcription units are indicated above the map. The B0.9 fragment is immediately distal to B10, and is represented by an open box. The lines below the map indicate the isolated transcription units. The arrow indicates the direction of transcription. Important features are the localisation of the proximal breakpoint of Df(1)p490D and the garnet locus. The fragments X9 and B10 (grey) were used as mus101 positive and negative rescue constructs respectively.

[0179] Symbols: E=EcoRI, B=BamHI, X=XhoI, vertical arrow=proximal breakpoint of Df(1)p490D.

[0180]FIG. 5—Identification of RFLPs in three strains of mus101

[0181] (A) RFLPs were identified in DNA from three strains of mus101 subjected to Southern-blot analysis following digestion with BamHI. The probe was the B10 fragment shown in (C). Lanes identify DNA from females of the following genotypes: 1) mus101^(k451)/FM3; 2) mus101^(D1)/mus101^(D1); 3) mus101^(D2)/mus101^(D2); 4) mus101^(tsl)/FM7; 5) mus101^(sm)/FM7; 6) mus101^(lcd)/FM6; 7) FM6/FM6; 8) FM7/FM7 and 9) OregonR/OregonR.

[0182] (B) The same blot shown in (A) was probed with the fragment B5.1 shown in (C).

[0183] (C) Diagrammatic representation of the reduced chromosome walk, as shown in FIG. 4, with special attention to the localisation of fragment B10.

[0184]FIG. 6—Genomic sequence of mus101

[0185] The genomic sequence of mus101 was obtained using sequence specific primers (see text for details). The non-coding regions (5′ and 3′) are represented by lower case letters. The coding region is represented by upper case letters. Both nucleotide sequence and predicted amino acid sequence are indicated and are numbered separately. The potential promoter region has two DRE motifs, represented in blue. The Mus101 predicted protein has seven BRCT domains, represented in green (BRCT I, III, V and VII) and red (BRCT II, IV and VI). The sequence corresponding to cDNAs is underlined. The region where both cDNAs overlap is double underlined. Four polyadenylation signals are represented in bold in the 3′ non-coding region.

[0186]FIG. 7—HCA plot of Mus101 predicted protein.

[0187] HCA plot of the Mus101 predicted protein representing the seven BRCT domains. The standard one-letter code for amino acids is used except for proline (★), glycine (♦), serine (□) and threonine (□) (Callebaut et al., 1997). The Mus101 HCA plot was obtained using the program DRAWHCA, available at the URL http://www.lmcp.jussieu.fr/k˜mornon/.

[0188]FIG. 8—Comparison of the HCA plots (A) and amino acid sequence (B) of BRCT domain I from Drosophila Mus101, human TopBP1, C. elegans clone F37D6.1, fission yeast Rad4/Cut5 and mouse Ect2.

[0189] The BRCT domain I was compared between Drosophila Mus101, human TopBP1, C. elegans clone F37D6.1, fission yeast Rad4/Cut5 and mouse Ect2 in both HCA plot (A) and amino acid sequence (B). In the HCA plots, the standard one-letter code for the amino acids is used except for proline (★), glycine (♦), serine (□) and threonine (□) (Callebaut et al., 1997). In “B”, the most conserved amino acids are represented in red, and the conserved amino acids in bold. (C) Schematic representation of the positions of the BRCT domains in Mus101, TopBP1, F37D6.1, Rad4/Cut5 and Ect2. The BRCT domains are represented by blue boxes.

[0190] (C) Comparison of the localisation and distribution of BRCT domains (blue boxes) in Mus101, TopBP1, F37D6.1, Rad4/Cut5 and Ect2.

[0191]FIG. 9—Pairwise alignment of Drosophila Mus101 predicted protein and human TopBP1.

[0192] Alignment of Mus101 predicted protein with human TopBP1 using the BLAST2 pairwise alignment program.

[0193] (A) Alignment of the N-terminal to central region of Mus101 and TopBP1. The scores are: score=228 bits (574), expect=3e⁻⁵⁸, identity=190/748 (25%), positives=323/748 (42%), gaps=84/748 (11%). The Mus101 BRCT domains I, III and V are represented in green, and the BRCT domains II and IV in red.

[0194] (B) Alignment of the C-terminal region of Mus101 and TopBP1. The scores are: score=102 bits (253), expect=e⁻²⁰, identity=67/200 (33%), positives=104/200 (51%), gaps=19/200 (9%). The Mus101 BRCT domain VI is represented in red and domain VII in green.

EXAMPLES

[0195] The first efforts to clone mus101 were carried out in our laboratory (Axton, 1990). First, a more refined cytological localisation of mus101 was achieved by the creation of the Df(1)lcd (12B2-9), allowing mus101 to be placed in the region 12B1,2-12B6. Second, a chromosome walk of 150 kb of genomic DNA was undertaken in this region. Molecular entry into the 12B region was gained using several approaches. The Yolk protein 3 gene (Yp3) was used as a probe to identify cloned genomic segments (as phages and cosmids) from 12B6, the proximal part of the chromosome walk. Microdissection of polytene chromosome bands 12B1,2 and PCR amplification of the DNA recovered provided probes corresponding to the distal part of 12B. Cosmids belonging to contigs 12.2 and 12.6 were joined and their relative orientations established by further chromosome walking (FIG. 1).

[0196] The chromosome walk in the 12B region covers 150 kb of cloned genomic DNA. Important landmarks in the walk are the breakpoint of Df(1)lcd in its distal part and the garnet locus in its proximal part. Df(1)lcd is the smallest deficiency that uncovers mus101. mus101 has been mapped by recombination to 0.1 cM to the distal side of garnet (i.e., in the telomeric direction). Thus, with knowledge of the positions of Df(1)lcd and garnet in the chromosome walk, the area to be searched for mus101 was reduced to approximately 90 kb of the walk.

[0197] To reduce the chromosome walk further to a manageable level for the purposes of cloning mus101, we have created new deficiencies in the 12B region. The different methods employed to achieve this goal are presented in Example 1. The localisation and molecular characterisation of mus101 and the analysis of the predicted Mus101 protein are discussed in Example 2.

[0198] Materials and Methods

[0199] Drosophila Techniques

[0200] Drosophila stocks were maintained on cornmeal medium under standard conditions (Ashburner, 1989; Roberts, 1986). Crosses were performed at 25° C. unless mentioned otherwise. Back-up stocks were maintained at 18° C.

[0201] Drosophila Stocks

[0202] All Drosophila stocks, balancer chromosomes and phenotypic markers used in this work, unless otherwise stated, are listed in Lindsley and Zimm, 1992 or in the FlyBase web site (http://morgan.harvard.edu). A list of the most relevant stocks used is presented in Table 1.

TABLE 1
List of the most relevant Drosophila stocks used in this work.

Drosophila stock    Reference
mus101^(D1)         Boyd et al., 1976
mus101^(D2)         Boyd et al., 1976
mus101^(sm)         A. Schalet, personal communication
mus101^(lcd)        Axton, 1990
mus101^(tsl)        Gatti et al., 1983
mus101^(k451)       Perrimon and Gans, 1983
Df(1)LCD(w⁻)        Axton, 1990
P[w⁺, ry⁺]E2        Levis et al., 1985
y⁺g⁺na⁺Y            A. Schalet, personal communication

[0203] Mutagenesis of Drosophila

[0204] X-Ray Mutagenesis

[0205] Males aged 3-5 days were irradiated with X-rays to achieve a dose of 40-80 Gy. This dose was obtained by exposing the males for 2, 3 or 4 minutes using the following parameters: 150 kV and 5 mA in a Torrex 150D X-ray machine (Astrophysics Research Corporation). After exposure to the X-rays, the males were transferred to bottles containing twice the number of virgin females (usually 30 males to 60 females). The pairs were left to mate for 2 days at 25° C. and then transferred twice at two-day intervals to bottles containing fresh food.
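
The relationship between exposure time and dose can be sketched as follows. The dose rate of 20 Gy per minute is not stated in the text; it is inferred here from the fact that exposures of 2 to 4 minutes gave 40 to 80 Gy, and it assumes that the dose rate is constant under the stated machine settings. The function and constant names are illustrative.

```python
# Assumed constant dose rate, inferred from "2, 3 or 4 minutes" giving 40-80 Gy.
ASSUMED_DOSE_RATE_GY_PER_MIN = 20.0


def xray_dose_gy(exposure_min, dose_rate=ASSUMED_DOSE_RATE_GY_PER_MIN):
    """Return the delivered dose in Gy for a given exposure time in minutes."""
    if exposure_min < 0:
        raise ValueError("exposure time cannot be negative")
    return dose_rate * exposure_min


# Doses for the three exposure times mentioned in the text.
doses = [xray_dose_gy(t) for t in (2, 3, 4)]
print(doses)
```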

[0206] DEB Mutagenesis

[0207] Males aged 3-5 days were left to starve for 6-10 h in an empty bottle containing filter paper with some drops of water to maintain the humidity. After starvation, the males were transferred to bottles containing filter paper saturated with 5 mM DEB (diepoxybutane, Sigma) in a 1% sucrose solution. The males were left to feed for an additional 18 h period at room temperature in a chemical fume hood. After treatment, the males were transferred to bottles containing fresh maize food and a two-fold excess of Basc virgin females (usually 30 males to 60 females), and kept at 25° C. The pairs were left to mate for 2 days, and transferred twice at two-day intervals to bottles containing fresh food.

[0208] All mutagen procedures were undertaken in a chemical fume hood. The glassware and solutions that were in contact with the mutagen were neutralised overnight with the neutralisation solution (1 M NaOH, 0.5% thioglycolic acid).

[0209] Mutagen Sensitivity Tests

[0210] Five pairs of flies of the desired genotype were left to mate in vials containing maize food (vial 1). After two days the adults were transferred to vials containing fresh food (vial 2) and left to mate for two additional days. The first vial, containing eggs and first instar larvae, was treated with a solution of MMS (methyl methanesulphonate, Sigma) diluted in distilled water to the desired concentration (usually 0.1%, unless otherwise mentioned). After 24 h in a chemical fume hood, the vials were transferred to 25° C. and left to develop. The progeny of the flies in vial 2 were left untreated and served as a control. All glassware in contact with the mutagen was neutralised as described above for DEB mutagenesis.

[0211] DNA Techniques

[0212] General Methods

[0213] Small- and large-scale isolation of plasmid DNA was performed using a Promega Wizard Plus kit (Promega). λ-gt10 cDNA (Poole et al., 1985) was isolated using a Qiagen Lambda mini kit (Qiagen). λZAP cDNA was isolated using the Uni-ZAP™ XR vector (Stratagene). Restriction endonucleases and ligases were purchased from Boehringer Mannheim, and the reactions were performed following the manufacturer's instructions. DNA was subcloned into the vector pBluescript II (Stratagene). Fragments were recovered from agarose gels using the Geneclean II kit (BIO 101). Restriction endonuclease digestion, gel electrophoresis of DNA and blotting of DNA onto nitrocellulose membranes were performed as described by Sambrook et al., 1989.

[0214] Genomic DNA was extracted from adult females and total RNA was extracted from Oregon-R wild type embryos according to Ashburner, 1989. Southern and Northern blotting were performed according to Sambrook et al., 1989, using high stringency conditions. DNA fragments were labelled using a Random Primed DNA labelling kit (Boehringer Mannheim). All extractions and reactions using kits were performed following the manufacturer's instructions.

[0215] DNA Sequencing

[0216] Sequencing reactions were performed using the ABI PRISM big dye terminator cycle sequencing ready reaction kit (Applied Biosystems) following the manufacturer's instructions. The reactions were run in an automated 377 Perkin-Elmer sequencer.

[0217] Mus101 Primers

[0218] Specific primers used to sequence genomic and partial mus101 cDNAs were purchased from Oligosyn, Dundee. The primers used are listed in Table 2.

TABLE 2
Primers used to sequence the genomic and partial mus101 cDNAs.

Name        Sequence 5′→3′
X99P1       CCGAAGCTATCGCTAGGT
X99P3       AGTCCCACGCGCATGCGA
X99P5       GACGATAGCTGCACCCAT
X99P6       GTGGCGGCGCTGCTTCGA
101C51P1    GAAGGCGCCGCTGTCGAC
B09P1       CGCACCTCTCGGATCTCT
B09P2       CCACAGCTTCAAGCAGCC
E28P1       AGGCAGGTGATGAGGCAC
E28P2       GTCTTCTTGCTGTCCGGC
GENX9P1     ACCGTTGGCAACGCTGGC
GENX9P2     CGCCATGTGGTCACCGAA
GENX9P3     TGTGTGGTGGTGACCAAC
GENX9P4     CTCTTCGCTTTGGTTTAG
GENX9P5     GCCAGCGTTGCCAACGGT
GENX9P6     TTAGCTTCTCACGCTCCT
GENX9P7     ACGATGATAGATGCCTCC
GENX9P8     TGGCACCAATAAGCCTTG
GENX9P9     GAGCTCCTGAACTGACGC
GENX9P10    TCGCGATTCGCAGTTCTT
GENX9P11    GTCGACAGCGGCGCCTTC
GENX9P12    TAGCGTAATGAATTACTA
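
Primer sequences such as those in Table 2 can be checked programmatically. The following Python sketch is offered as an illustration only: the helper names are introduced here, and only four of the listed primers are included. It computes the reverse complement and GC fraction of a primer and reports primers that are exact reverse complements of one another, that is, primers reading the same site on opposite strands.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# A subset of the primers in Table 2 (sequences copied from the table).
PRIMERS = {
    "GENX9P1": "ACCGTTGGCAACGCTGGC",
    "GENX9P5": "GCCAGCGTTGCCAACGGT",
    "101C51P1": "GAAGGCGCCGCTGTCGAC",
    "GENX9P11": "GTCGACAGCGGCGCCTTC",
}


def find_reverse_complement_pairs(primers):
    """List pairs of primer names whose sequences are reverse complements."""
    names = sorted(primers)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if reverse_complement(primers[a]) == primers[b]
    ]


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2))
print(find_reverse_complement_pairs(PRIMERS))
```

With these four primers the sketch pairs GENX9P1 with GENX9P5 and 101C51P1 with GENX9P11.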

[0219] P-Element Mediated Germline Transformation

[0220] Embryos from flies of the genotype w¹¹¹⁸/w¹¹¹⁸; +/+; Δ2-3(68C)/+ were collected for 45 minutes at 25° C. on grape juice agar plates. All subsequent operations were performed at 18° C. to delay development of the embryo. Embryos were dechorionated in 3% bleach for 2 minutes, rinsed extensively in tap water and aligned on a cut block of grape juice agar so that the posterior poles protruded over the edge of the block. A slide covered with double-sided Scotchtape™ was then used to pick up the embryos. Following desiccation for 4 minutes under cold air (hairdryer), embryos were covered in Voltalef 10S halocarbon oil prior to injection for transformation (Karess, 1985). Injection needles (Clarke Electromedical Instruments Ltd) were pulled on a Narishige Scientific Instrument needle puller. Injection of embryos was performed on a Prior micromanipulator using an inverted microscope (Olympus). DNA for injection was centrifuged to remove debris which could block the needle. The constructs were injected at an initial concentration of 0.1-1 mg/ml in injection buffer (5 mM KCl, 0.1 mM NaH₂PO₄/Na₂HPO₄, pH 7.8) into the posterior pole of the embryo.

[0221] Slides were placed in a humid chamber for 48 hours at 18° C. and hatched larvae were transferred to vials containing fresh food. Surviving adults were mated individually with w¹¹¹⁸ virgin females or males and progeny showing a w⁺ phenotype were identified as transformants. The chromosome on which the P-element had been inserted was identified by crossing virgins balanced for the second (w¹¹¹⁸/w¹¹¹⁸; Tft/CyO; +/+) or third chromosomes (w¹¹¹⁸/w¹¹¹⁸; +/+; TM3/TM6c) to w⁺ transformant males. All transformants were kept as homozygous stocks. Table 3 shows the transformants obtained from injection of the positive and negative constructs.

[0222] The X9 fragment, a 9 kb XhoI fragment from cosmid 44F7, was subcloned into the XhoI site of the transformation vector pW8 (Klemenz et al., 1987) to form the positive rescue construct p[w⁺, X9]. The B10 fragment, a 10 kb BamHI fragment from cosmid 22F12, was subcloned into the BamHI site of the transformation vector pCaSpeR (Pirrotta, 1988) to form the negative rescue construct p[w⁺, B10].

TABLE 3
List of transformant lines obtained after injection of mus101 positive and negative rescue constructs. The chromosome in which the P-element has been inserted is indicated.

Construct name      Chromosome    Construct
p[w⁺, X9]C9A        3rd           +
p[w⁺, X9]C11A       X             +
p[w⁺, X9]D3B        3rd           +
p[w⁺, X9]D13D       2nd           +
p[w⁺, X9]H5         2nd           +
p[w⁺, X9]H5C        3rd           +
p[w⁺, X9]J10B       2nd           +
p[w⁺, X9]J14A       2nd           +
p[w⁺, X9]K8A        2nd           +
p[w⁺, X9]K8B        X             +
p[w⁺, X9]L4B        3rd           +
p[w⁺, X9]M1         2nd           +
p[w⁺, X9]M5B        2nd           +
p[w⁺, X9]M6         X             +
p[w⁺, B10]A4D       3rd           −
p[w⁺, B10]A8A       3rd           −
p[w⁺, B10]A8E       2nd           −
p[w⁺, B10]E10A      3rd           −
p[w⁺, B10]F4A       X             −
p[w⁺, B10]F4B       3rd           −
p[w⁺, B10]F7A       2nd           −
p[w⁺, B10]F10E      3rd           −
p[w⁺, B10]F11A      X             −
p[w⁺, B10]I1A       X             −
p[w⁺, B10]J3C       2nd           −
p[w⁺, B10]J4A       2nd           −
p[w⁺, B10]J5B       3rd           −

Example 1

[0223] Creation of New Deficiencies in the Region 12B1,2-6

[0224] At the onset of this work, we were faced with the problem of correlating the existing cytogenetic maps that defined the position of mus101 with the 150 kb chromosome walk that had been carried out in the region 12B (Axton, 1990). It was imperative to create new deficiencies in this area not only because these might create new alleles of mus101, but because they might identify part of the chromosome walk not containing the two genes. It would be desirable to create small deficiencies in order to reduce the walk to a manageable size.

[0225] Three approaches were used to create deficiencies in the region 12B: X-ray mutagenesis; chemical mutagenesis using DEB; and imprecise excision of P-elements, the last of which generated useful deficiencies. The imprecise excision of P-elements is a powerful means of mutagenesis. When a P-element excises from the chromosome, three events can occur: the excision can be precise, it can be imprecise, or it can cause internal rearrangements. We were aiming to obtain imprecise excisions of the P-element in which simultaneous loss of adjacent DNA would occur. Such deletions extend from the position of the insertion of the P-element in the chromosome.

[0226] Independent of the mutagen used, it is necessary to use a chromosome carrying a phenotypic marker in the vicinity of the region to be mutated. In this work, we used the strain P[w⁺, ry⁺]E2 (Levis et al., 1985).

[0227] Strain P[w⁺, ry⁺]E2

[0228] The Drosophila strain P[w⁺, ry⁺]E2 carries a P-element in the region 12B1-2 (Levis et al., 1985). This P-element has white (w⁺) and rosy (ry⁺) phenotypic eye colour markers. The presence of the w⁺ marker facilitates the analysis of mutated chromosomes, since w mutant and wild-type eye colour can be easily recognised.

[0229] The P-element in the P[w⁺, ry⁺]E2 strain had been localised within the 12B region of the chromosome walk. It lies in the distal part of the walk within a 12 kb BamHI fragment found in wild-type (Oregon-R or Canton-S) DNA. It is detected using a 4.7 kb BamHI fragment from the end of the insert in the cosmid 29E9 (named B4.7). In the strain P[w⁺, ry⁺]E2, this probe recognises a 6 kb BamHI fragment that extends from the internal BamHI site in the P-element to a site in the proximal flanking DNA (FIG. 2B).

[0230] Knowledge of the P-element location at molecular level was essential to analyse the new chromosomal rearrangements generated by mutagenesis. Specifically it facilitates the determination of whether a mutated chromosome has a deletion internally in the region of the walk or not. If the 6 kb BamHI band is present, it means that the chromosome P[w⁺, ry⁺]E2 is still intact in this region. However, if it is absent or altered, a small deletion (or rearrangement) has occurred.

[0231] P-Element Imprecise Excision

[0232] The generation of P-element imprecise excision mutants is a very direct means of mutagenesis. When a P-element excises from its original site, it can carry segments of the chromosome with it. The amount of DNA deleted and the frequency of this event depend upon the P-element's insertion site. Deficiencies were created in the region 12B by P-element mobilisation. The extent of the deletions was monitored both genetically and by molecular approaches. To this end, males carrying both the P-element and a transposase source were generated, and lethal strains with loss of w⁺ were selected from among their progeny. We chose to look for lethal mutants, since mus101 has lethal phenotypes. We searched for mutations that would delete one, the other, or both of these genes. However, viable mutations with loss of w⁺ that delete part of the chromosome walk can still be generated. These would not be detected, but in any event the molecular analysis of such mutants would be very time consuming and they might prove not to be informative.

[0233] The scheme of the crosses is illustrated below. In the first cross, made en masse, homozygous females carrying the P-element P[w⁺, ry⁺]E2 were crossed with males carrying the transposase, to generate “jump-start males” in the next generation. Three “jump-start males” were crossed with five crm/FM7 virgin females (a stock carrying the lethal mutation cramped (crm), which affects a Polycomb group gene in the region 3C (Yamamoto et al., 1997), was used as a source of the FM7 balancer chromosome). This cross not only balanced the progeny, but also allowed us to select flies in which the transposition event had occurred, by loss of the w⁺ marker. The third cross was designed to check lethality. Four P[w⁺, ry⁺]E2*/FM7 females from each G₁ vial were crossed individually to FM7 males. The lines in which hemizygous mutant males were not recovered were kept balanced over FM7 for further genetic analysis.

[0234] A total of 1,698 G₂ crosses were analysed, and 23 lethal mutants were recovered.

G₀: P[w⁺, ry⁺]E2/P[w⁺, ry⁺]E2 (60) ⊗ +/Y; Δ2-3, Sb/Df (30)
G₁: crm/FM7 (5) ⊗ P[w⁺, ry⁺]E2/Y; Δ2-3, Sb/+ (3)
G₂: P[w⁺, ry⁺]E2*/FM7 (1); +/+ ⊗ FM7/Y (3)

Where (*) means loss of w⁺. Select: lethal mutants, discard all viable ones. Numbers in brackets represent the average number of flies used per bottle/vial.
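
The recovery frequency implied by these figures can be computed directly. The short Python sketch below uses only the two numbers given above (1,698 G₂ crosses and 23 lethal mutants); the function name is illustrative.

```python
def recovery_frequency_percent(mutants_recovered, crosses_analysed):
    """Return the percentage of analysed crosses that yielded a mutant."""
    if crosses_analysed <= 0:
        raise ValueError("number of crosses must be positive")
    return 100.0 * mutants_recovered / crosses_analysed


frequency = recovery_frequency_percent(23, 1698)
print(f"{frequency:.2f}% of G2 crosses gave a w- lethal line")
```

This corresponds to roughly one lethal line with loss of w⁺ for every 74 G₂ crosses analysed.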

[0235] Genetic Analysis of P-Imprecise Excision Mutants

[0236] The mutants generated by P-element imprecise excision were analysed in complementation tests with strains mus101^(sm) and dd4^(08.20). Virgin p/FM7 females were crossed with mus101^(sm)/y⁺g⁺na⁺Y and dd4^(08.20)/y⁺g⁺na⁺Y. The recovery (+) or not (−) of the heterozygous combinations is presented in Table 4. These mutants can be divided into two classes: those that complement both mutations tested (p44C, p71A, p100B, p116D, p205A, p205C, p222C, p281A, p352D, p426D and p490D), and those that fail to complement both (p30A, p69A, p189C, p232B, p263C, p244B, p259B, p309B, p327D, p333C, p414C and p514C).

[0237] In order to test if the P mutations that delete the gene mus101 would also delete garnet, an eye colour mutation localised in the proximal part of the walk (FIG. 2B), complementation tests were performed using the garnet strain wy²g⁴. The progeny of these crosses were checked for the presence of garnet (g⁻) or wild type (g⁺) eye colour. The results are given in Table 4.

[0238] All P-element imprecise excision mutants that delete the gene mus101 delete garnet as well. This result indicates that the deletions extend from the distal part of the walk to at least, or beyond, the garnet locus. The mus101 locus was mapped by recombination by A. Schalet (Yale University) to about 0.1 cM to the distal side of garnet (i.e., in the telomeric direction). Unfortunately these deletions are not informative with respect to mapping the mus101 locus, since they remove a large part of the walk.

[0239] Five strains that complement the mus101 mutations were also tested for their ability to complement garnet. The strains p116D, p205A, p281A, p426D and p490D when heterozygous with garnet have wild-type eye colour (Table 4).

[0240] Molecular Analysis of P-Element Imprecise Excision Mutants

[0241] The eleven mutants that complement the mus101 mutation were analysed at the molecular level in Southern-blot experiments. DNA from p/FM7 females was cleaved with BamHI and probed with the genomic fragment B4.7 described above. The hybridisation patterns of these mutants fall into two classes:

[0242] a) No deletion of the chromosome walk: This class includes mutants p44C, p71A, p100B, p205C, p352D and p426D. Southern-blots of these mutants reveal bands of 6 kb and 12 kb, corresponding to the P[w⁺, ry⁺]E2 and FM7 chromosomes respectively (see mutant p205C as an example, in FIG. 2A, lane 6). One case of chromosome rearrangement is seen in mutant p222C (FIG. 2A, lane 7), which has an 8 kb band instead of the 6 kb band derived from the chromosome P[w⁺, ry⁺]E2. In these mutants there is no deletion of the internal part of the walk. There might be, however, a deletion on the other side of the P-element, causing the loss of eye colour and the lethal phenotype. It is also possible that the loss of eye colour is due to an internal rearrangement in the P-element, and the lethality is due to a second site mutation. These mutants were not further characterised.

[0243] b) Deletion of the chromosome walk: In this class of mutants, represented by strains p116D, p205A, p281A and p490D, the 6 kb band present in the strain P[w⁺,ry⁺]E2 was deleted (FIG. 2A, lanes 8-11). The DNA of these mutants was further analysed to determine the extent of the deletion in the walk (see below).

[0244] Of the 23 w⁻ lethal mutations obtained by P-element imprecise excision, 4 had a deletion in the walk. It is known, by genetic analysis, that these 4 mutants (p116D, p205A, p281A and p490D) complement the lethal phenotypes of mutant mus101^(sm). Furthermore, these mutant chromosomes, when heterozygous with mutant wy² g⁴, have wild-type eye colour, indicating that the garnet locus, localised in the proximal part of the walk, is still intact. It thus can be concluded that these deficiencies extend proximally to a point between the P-element and the garnet locus. The precise localisation of these breakpoints would therefore be of great importance to exclude the part of the walk that does not contain the mus101 gene.

TABLE 4
Complementation analysis of P imprecise excision mutants. p/FM7 females were crossed with mus101^(sm)/y⁺g⁺na⁺Y, dd4^(08.20)/y⁺g⁺na⁺Y and wy²g⁴/Y.

P imprecise mutant    mus101^(sm)    wy²g⁴    y⁺g⁺na⁺Y
p30A                  −              −        −
p44C                  +              NT       −
p69A                  −              −        −
p71A                  +              NT       −
p100B                 +              NT       −
p116D                 +              +        −
p189C                 −              −        −
p205A                 +              +        −
p205C                 +              NT       +
p222C                 +              NT       −
p232B                 −              −        −
p259B                 −              −        −
p263C                 −              −        −
p264B                 −              −        −
p281A                 +              +        −
p309B                 −              −        −
p327D                 −              −        −
p333C                 −              −        −
p352D                 +              NT       −
p414C                 −              −        −
p426D                 +              +        −
p490D                 +              +        −
p514C                 −              −        −
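The selection of candidate strains from the complementation data can be illustrated with a short sketch. The Python fragment below is illustrative only and is not part of the experimental procedure: it transcribes the first two result columns of Table 4 (complementation of mus101^(sm) and of wy²g⁴) into a dictionary, and the names `TABLE4` and `complements` are chosen here purely for the example.

```python
# First two result columns of Table 4: (mus101^sm, wy2 g4).
# "+" = complements, "-" = does not complement, "NT" = entry listed as NT.
TABLE4 = {
    "p30A": ("-", "-"), "p44C": ("+", "NT"), "p69A": ("-", "-"),
    "p71A": ("+", "NT"), "p100B": ("+", "NT"), "p116D": ("+", "+"),
    "p189C": ("-", "-"), "p205A": ("+", "+"), "p205C": ("+", "NT"),
    "p222C": ("+", "NT"), "p232B": ("-", "-"), "p259B": ("-", "-"),
    "p263C": ("-", "-"), "p264B": ("-", "-"), "p281A": ("+", "+"),
    "p309B": ("-", "-"), "p327D": ("-", "-"), "p333C": ("-", "-"),
    "p352D": ("+", "NT"), "p414C": ("-", "-"), "p426D": ("+", "+"),
    "p490D": ("+", "+"), "p514C": ("-", "-"),
}


def complements(column):
    """Return the sorted strain names scored '+' in the given column (0 or 1)."""
    return sorted(strain for strain, row in TABLE4.items() if row[column] == "+")


mus101_ok = complements(0)   # strains complementing mus101^sm
garnet_ok = complements(1)   # strains complementing wy2 g4 (garnet)
print(len(TABLE4), len(mus101_ok), garnet_ok)
```

Filtering in this way recovers the eleven strains analysed by Southern blotting and the five strains that were also tested against garnet.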

[0245] Reduction of the Chromosome Walk

[0246] In the previous sections we have presented genetic and molecular data concerning the creation of new deficiencies in the 12B region. Of all mutants recovered (54 in total) using different methods, only five have deleted part of the walk. Among these mutants are the DEB-induced mutant deb154 and the P-element imprecise excision mutants p116D, p205A, p281A and p490D. The following section presents data about the extent of these deletions in the chromosome region represented by the walk.

[0247] Genetic analysis performed earlier (see above) has shown that these mutants complement the mus101 mutation. Thus it could be anticipated that the molecular information obtained would delimit the chromosome region containing the mus101 gene in relation to the original P-insertion site of P[w⁺,ry⁺]E2.

[0248] The strategy used to determine the breakpoint of the deficiencies in the mutants deb154, p116D, p205A, p281A and p490D was as follows: DNA was extracted from mutants balanced over FM7, cleaved with different restriction endonucleases and analysed in Southern-blotting experiments probed with several fragments from the genomic walk.

[0249] The first informative genomic region was an 8 kb EcoRV fragment recovered from cosmid 110D5, referred to as EV8 (FIG. 3C). This region is polymorphic between the P[w⁺, ry⁺]E2, wild-type (OregonR) and FM7 balancer chromosomes. The fragments that this probe recognises in these strains are 9 kb, 8 kb and 10 kb, respectively, when DNA is cleaved with the enzyme EcoRV (FIG. 3A, lanes 6, 7 and 8).

[0250] DNA from the mutants deb154, p116D, p205A, p281A and p490D and appropriate controls was cleaved with the enzyme EcoRV, blotted and probed with the fragment EV8. The mutants deb154, p116D and p205A show, after hybridisation, the 9 and 10 kb bands corresponding to the strain P[w⁺, ry⁺]E2 and to the FM7 chromosome (FIG. 3A, lanes 1-3). The mutants p281A and p490D show only the 10 kb band, corresponding to the FM7 chromosome (FIG. 3A, lanes 4 and 5).

[0251] From this analysis it can be concluded that the breakpoints of deficiencies deb154, p116D and p205A lie in the interval between the site of the P-element insertion and the location of the fragment EV8 (FIG. 3C). The precise locations of these breakpoints were not determined. The deficiencies in strains p281A and p490D span the area covered by the fragment EV8.
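The inference drawn from the EV8 hybridisation pattern can be written out as a simple rule. The sketch below is illustrative only; the band sizes and mutant names are taken from the text, while the function name `ev8_status` and the wording of the returned labels are introduced here for the example.

```python
# EcoRV band sizes (kb) detected by probe EV8, as stated in the text.
PARENT_BAND_KB = 9      # P[w+, ry+]E2 chromosome
BALANCER_BAND_KB = 10   # FM7 balancer chromosome

# Bands reported for each mutant/FM7 heterozygote (FIG. 3A, lanes 1-5).
observed = {
    "deb154": {9, 10},
    "p116D": {9, 10},
    "p205A": {9, 10},
    "p281A": {10},
    "p490D": {10},
}


def ev8_status(bands):
    """Classify the mutant chromosome at the EV8 region from its band set."""
    if PARENT_BAND_KB in bands:
        # The parental 9 kb band survives, so the deficiency stops short of EV8.
        return "EV8 retained: breakpoint lies between the P-element and EV8"
    # Only the balancer band remains, so the deficiency removes the EV8 region.
    return "EV8 deleted: deficiency spans EV8"


for mutant, bands in observed.items():
    print(mutant, "->", ev8_status(bands))
```

Because every mutant is analysed over FM7, the 10 kb balancer band serves as an internal control: its presence shows that the absence of the 9 kb band reflects a deletion rather than a failed hybridisation.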

[0252] A second informative genomic fragment was a 15 kb BamHI fragment from cosmid 165E3, referred to as B15 (FIG. 3C). DNA was extracted from mutant females over the FM7 balancer and from controls, cut with BamHI, Southern-blotted and probed with the fragment B15.

[0253] DNA from the P[w⁺, ry⁺]E2, OregonR and FM7 chromosomes shows a 15 kb band after hybridisation, as does DNA from the mutant p281A (FIG. 3B, lanes 3, 4, 5, and 1, respectively). DNA from mutant p490D shows an additional 13 kb band, indicating the location of the breakpoint for this mutant (FIG. 3B, lane 2). The breakpoint of the deficiency p490D (Df(1)p490D) is shown by a vertical arrow in FIG. 3C. It is not clear from this blot whether the mutant p281A has a deletion that extends into the B15 region, or whether the breakpoint lies between the EV8 and B15 regions. No further characterisation was done for the mutant p281A.

[0254] The results presented in this section effectively reduce the molecular limits of the chromosome region that contains the mus101 gene. The region that extends from the P-element in the proximal part of the walk to the region covered by the genomic fragment B15 does not contain the mus101 gene. Therefore, the gene should be located towards the proximal part of the walk from B15.

SUMMARY

[0255] In this example, we have shown the data obtained after mutagenesis of the chromosome P[w⁺, ry⁺]E2 (Levis et al., 1985) using imprecise excision of P-elements. The aim of using this technique was to recover small deficiencies in the 12B region, in order to eliminate DNA represented by segments of the walk area as candidate regions for the molecular localisation of the mus101 gene.

[0256] We have recovered 26,458 DEB-treated chromosomes, 10 of which were w⁻. Only one of these mutants, deb154, has a deletion in the 12B walk area. Even though the exact molecular localisation of the breakpoint was not determined, it may be deduced to lie between the P-element insertion point in the walk and the 8 kb EcoRV fragment represented in FIG. 3C. Thus, the deletion in the mutant deb154 is not bigger than 25 kb.

[0257] The mutagenesis using imprecise excision of the P-element proved to be very effective. 23 lethal mutations were obtained in a total of 1,698 w mutants (1.35%), and four lethal mutants have a deletion in the region of the walk. The Df(1)p490D contained the biggest deletion (see section 3.5); the molecular localisation of the proximal breakpoint of this deletion has led to a significant reduction of the walk, by 50 to 60 kb, starting from the P-element.

[0258] Furthermore, genetic analysis showed that the Df(1)p490D complements lethal mutations of mus101. So, the gene of interest is not localised within this 50-60 kb of the walk.

[0259] The distance between the proximal breakpoint in Df(1)p490D and the garnet locus is approximately 30 kb. Recombination studies by A. Schalet (Yale University) have positioned mus101 to about 0.1 cM to the distal side of garnet. Thus, by both molecular and genetic analyses, one can say that mus101 is located in this 30 kb interval. This is indeed the case, and the data concerning the molecular cloning of mus101 will be presented in the following chapter.

Example 2

[0260] Cloning of mus101

[0261] In the previous chapter data has been presented showing the strategy used to limit the region of the chromosome walk in the region 12B that should encompass the mus101 locus. A new deficiency, in the 12B region, named Df(1)p490D, created by P-element imprecise excision has deleted chromosomal DNA represented by approximately 60 kb of the original chromosome walk. The Df(1)p490D complements all phenotypes of mus101, therefore the mus101 locus is outside the region uncovered by this deficiency. However, localisation of the proximal breakpoint of Df(1)p490D, together with genetic mapping of mus101 distal to garnet, has resulted in a reduction of the original walk that should contain mus101 to within a 30 kb interval.

[0262] In this example we will present the strategy used to localise precisely the mus101 gene in mutant alleles utilising restriction fragment length polymorphisms (RFLPs). Partial cDNAs were isolated from different libraries, and the sequences of both genomic DNA and partial cDNAs for mus101 were determined. Using P-element mediated germline transformation we confirmed that the cloned transcription unit can rescue all mutant phenotypes of mus101.

[0263] Localisation of mus101

[0264] We first wished to determine whether any of the mus101 alleles showed an RFLP in the region extending from the proximal breakpoint of the Df(1)p490D to the gene garnet. This area corresponds to approximately 30 kb of cloned DNA, and is diagrammatically represented in FIG. 4B. Genomic fragments from the distal-most part of this region did not detect RFLPs when used as probes in Southern-blots of restriction endonuclease digested DNA extracted from various mus101 alleles. The most distal fragment tested in these experiments was a 5.1 kb BamHI fragment (B5.1, FIG. 4B). Thus if the distal part of the 30 kb region to the point of this 5.1 kb fragment (B5.1) can be eliminated, the new area to be searched for RFLPs in mus101 mutants is further reduced to approximately 15 kb.

[0265] We have tested all mus101 alleles for RFLPs in this remaining 15 kb of the reduced chromosome walk. We extracted DNA from homozygous females (mus101^(D1), mus101^(D2)), or females heterozygous with a balancer chromosome (mus101^(lcd)/FM6, mus101^(sm)/FM7, mus101^(tsl)/FM7 and mus101^(K451)/FM3) for cleavage with BamHI. After electrophoresis and transfer, the membrane was probed with genomic fragments from the proximal part of the region. The fragment B10 (FIG. 5C) detects an RFLP in the lethal strain mus101^(lcd) and in the mutagen-sensitive strains mus101^(D1) and mus101^(D2) (FIG. 5A, lanes 2 and 3). In balancer chromosomes (FM6 or FM7) or wild type (Oregon-R), this probe detects a fragment of 10 kb (FIG. 5A, lanes 7-9). In the strain mus101^(lcd) the B10 fragment reveals two bands: the 10 kb band of the balancer chromosome FM6, and a second band, of approximately 12 kb, corresponding to the mus101^(lcd) chromosome (FIG. 5A, lane 6). In the strains mus101^(D1) and mus101^(D2) the fragment B10 identifies only one band, of approximately 9 and 8.5 kb respectively (FIG. 5A, lanes 2 and 3). To confirm that the 12 kb band present in the strain mus101^(lcd) is not due to partial cleavage of DNA, the blot was re-probed with the fragment B5.1 (FIG. 5C). As can be observed in FIG. 5B, the probe identifies a band of approximately 5 kb in all lanes.

[0266] The identification of RFLPs in three independent mus101 alleles in a region corresponding to the B10 restriction fragment strongly suggests that the mus101 locus is located at this site. The B10 fragment partially overlaps with a 6.5 kb EcoRI fragment (E6.5, FIG. 4B), known to contain the garnet locus, suggesting that mus101 and garnet are closely physically linked, consistent with previous recombination mapping data.

[0267] Identification of Transcription Units in the mus101 Area

[0268] Two methods were used to identify transcription units in the area surrounding mus101: Northern-blots and isolation of cDNAs.

[0269] Northern-Blot Analysis

[0270] Total RNA extracted from 0-4 hour embryos was used in the Northern-blot experiments. The blots were probed with genomic fragments E2.9, B0.9 and B10 (FIG. 4B). These are adjacent fragments in the walk and cover an area of about 14 kb. The fragments B0.9 and E2.9 are distal to B10.

[0271] The fragment E2.9 recognised a transcription unit of approximately 3 kb (FIG. 4A, lane 1 and FIG. 4B, transcript A). The transcription unit A is not likely to be mus101, since it is not detected by fragment B10. The fragment B10 identified two transcription units, one of approximately 5 kb and one of approximately 3.5 kb (FIG. 4A, lane 3, FIG. 4B, transcripts B and C, respectively). The fragment B0.9, distal to B10, also recognised the transcription unit B (FIG. 4A, lane 2, FIG. 4B), indicating that it spans the B0.9 and B10 restriction fragments. The transcription unit C is likely to be garnet, which is known to lie within the B10 fragment and which corresponds in size to the embryonic garnet transcription unit. It is therefore likely that transcription unit B corresponds to mus101, since it is detected by fragment B10, the fragment that shows RFLPs in the strains mus101^(lcd), mus101^(D1) and mus101^(D2).
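The reasoning by which transcription unit B is singled out can be restated as a small set calculation. The following sketch is illustrative only: the probe-to-transcript assignments are those reported for the Northern blots, and the variable names are introduced here for the example.

```python
# Transcription units detected by each genomic probe (FIG. 4A).
detected = {
    "E2.9": {"A"},
    "B0.9": {"B"},
    "B10": {"B", "C"},
}
GARNET = "C"  # transcription unit C is assigned to garnet in the text

# A mus101 candidate must be detected by B10 (the fragment showing RFLPs in
# mus101 alleles) and must not be the unit already assigned to garnet.
candidates = detected["B10"] - {GARNET}

# Probes that detect each candidate show which fragments the unit spans.
spanned = {
    unit: sorted(probe for probe, units in detected.items() if unit in units)
    for unit in candidates
}
print(candidates, spanned)
```

Under these assignments the only remaining candidate is transcription unit B, which is detected by both B0.9 and B10.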

[0272] Screening of an Embryonic λgt10 cDNA Library

[0273] We have screened the 0-3 hour embryonic library constructed in the vector λgt10 (Poole et al., 1985) for mus101 cDNAs. As a probe we used a 9 kb XhoI fragment (X9) which partially overlaps the fragments E2.9, B10 and E6.5 and contains the whole B0.9 fragment (FIG. 4B). With this fragment we would be able to confirm the previously identified transcription units reported above.

[0274] Six clones were isolated (X91, X93, X94, X95, X96 and X99) and proved to have inserted EcoRI fragments that varied from 1.2 to 3 kb. In order to facilitate the molecular analysis of these cDNAs, they were sub-cloned into the EcoRI site of the vector pBluescript (Stratagene), as described in the Materials and Methods.

[0275] The sequence of both ends of the six cDNAs was determined using the commercially available primers, as described in the Materials and Methods. Analysis of the partial DNA sequence revealed that the cDNAs X91, X94, X95 and X96 were overlapping, and they correspond to the sequence of the garnet cDNA. Thus, these four cDNAs represent transcription unit C (garnet) in FIG. 4B.

[0276] The partial sequences of cDNAs X93 and X99 are non-overlapping, and do not correspond to the garnet cDNA. The sequence of the cDNA X93 does overlap that of a cDNA previously isolated using the fragment E2.9 as probe (data not shown), and is likely to be transcription unit A (FIG. 4B). A homology search using the BLASTX program (Altschul et al., 1997) revealed that the partial sequence of the cDNA X93 is similar to the human clone KIAA0544 (Nagase et al., 1998, Genbank accession number 3043612) and to the C. elegans locus CEC11H1 (Genbank accession number Z70205), identified in the genome project. Although no function has been assigned to these loci, the human KIAA0544 has significant similarity to human cell growth regulators and apoptosis inhibitors (Nagase et al., 1998).

[0277] A BLASTX search using the clone X99 shows similarity with a human cDNA (Nagase et al., 1996), which in turn is similar to the S. pombe rad4⁺/cut5⁺ gene (Fenech et al., 1991; Saka and Yanagida, 1993). Mutants for rad4⁺/cut5⁺ show sensitivity to a variety of DNA damaging agents, and it has been shown that the product of the gene rad4⁺/cut5⁺ is essential for S phase, also being important in the replication checkpoint. Owing to the similar phenotypes presented by mus101 mutants and rad4⁺/cut5⁺ mutants of S. pombe, and the similarities in sequence, the cDNA X99 is a good candidate for mus101. The cDNA X99 was therefore renamed mus101B1. The genomic region encoding this transcript was subsequently shown to rescue mus101. Further discussion concerning the sequence homology will be addressed below.

[0278] The cDNA mus101B1 was sequenced in both directions using oligonucleotide primers designed specifically for its sequence (Materials and Methods). The total length of the mus101B1 cDNA is 1372 bp, and it has an open reading frame (ORF) of 456 amino acids (FIG. 6, underlined). The cDNA mus101B1 is not full length and does not contain the 3′ non-coding region. It corresponds to nucleotides 2870 to 4241 of the mus101 genomic sequence (see below). There is a difference of one nucleotide between the cDNA and genomic sequences at position 4227 (genomic number). In this position there is a cytosine in the genomic and an adenine in the cDNA sequence. This nucleotide substitution is in the third position of the codon, and does not lead to an amino acid change (in both cases the triplet codes for an alanine). This nucleotide substitution must be due to natural polymorphism present in the different wild-type strains used to construct the genomic and cDNA library.
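The coordinate arithmetic and the silent nature of the substitution can be checked with a few lines of code. This is a minimal sketch under stated assumptions: the coordinates and the C/A difference at the third codon position come from the text, whereas the first two codon bases ("GC") are not read from FIG. 6 but inferred from the fact that all four alanine codons in the standard genetic code have the form GCN. The names used are introduced here for illustration.

```python
def inclusive_length(start, end):
    """Length of a sequence interval whose start and end are both included."""
    return end - start + 1


# mus101B1 corresponds to genomic nucleotides 2870 to 4241.
cdna_len = inclusive_length(2870, 4241)

# Alanine codons in the standard genetic code (DNA alphabet): GCN.
ALANINE_CODONS = {"GCT", "GCC", "GCA", "GCG"}

# Assumption: the codon is GCN because both versions encode alanine;
# only the third base (C in genomic DNA, A in the cDNA) is taken from the text.
genomic_codon = "GC" + "C"
cdna_codon = "GC" + "A"
is_silent = genomic_codon in ALANINE_CODONS and cdna_codon in ALANINE_CODONS

print(cdna_len, is_silent)
```

The interval length agrees with the stated 1372 bp, and an ORF of 456 codons (1368 nucleotides) fits within it.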

[0279] The expected size of a complete mus101 cDNA is about 4.5 to 5 kb (as judged by the size of transcript B identified by Northern-blot). Since the cDNA mus101B1 is much smaller than the predicted size, we conclude that it represents only a partial reverse transcript. The search for a complete cDNA was therefore necessary.

[0280] Screening of Embryonic λZAP cDNA Library

[0281] In order to isolate a complete mus101 cDNA, we screened a λZAP 2-14 hour embryonic cDNA library (Stratagene), using the cDNA mus101B1 as a probe, as described in Materials and Methods. Only one cDNA was recovered, and named mus101B2.

[0282] The cDNA mus101B2 was completely sequenced using oligonucleotide primers specific for its sequence. Both strands were sequenced, as described in Materials and Methods. It has an ORF of 193 amino acids and 548 bp of 3′ non-coding region, including four consensus polyadenylation signals (AATAAA; FIG. 6, bold). The cDNA mus101B2 corresponds to nucleotides 3966 to 4868 of the genomic sequence presented in FIG. 6, underlined. As with cDNA mus101B1, the mus101B2 cDNA also differs from the genomic region in position 4227.

[0283] There is a 545 bp overlap between the 3′ end of mus101B1 cDNA and the 5′ end of mus101B2 cDNA (FIG. 6, double underlined). Thus the cDNA mus101B2 is not complete, and only represents an extension to the 3′ end of the mus101B1 cDNA. The cDNA mus101B2 is underlined in FIG. 6, as a continuation of cDNA mus101B1.

[0284] The low number of cDNAs recovered from two independent embryonic libraries suggests that mus101 is expressed at low levels during this stage of development. A search for a complete cDNA in libraries representing other stages of fly development is necessary.

[0285] Identification of mus101 Locus by P-element Mediated Germ-Line Transformation

[0286] In order to confirm the cloning of mus101, two genomic constructs were designed for use in P-element mediated germ-line transformation experiments: a positive construct, containing the whole mus101 sequence and promoter region, and a negative control, lacking the 5′ region of mus101 (coding and non-coding regions).

[0287] The genomic fragment X9, a 9 kb XhoI fragment referred to in section 4.2.2 (FIG. 4B) was used as a positive construct. The X9 fragment was cloned in the transformation vector pW8 to form plasmid p[w⁺, X9] (Materials and Methods). This fragment contains the complete mus101 gene and incomplete sequences of transcription unit “A” and garnet. The B10 fragment (FIG. 4B) was subcloned in the transformation vector pCasPeR, to give the negative construct p[w⁺, B10] (Materials and Methods). The fragment B10 is missing approximately 2 kb from the 5′ end of mus101.

[0288] Both constructs were injected into w¹¹¹⁸; Δ2-3(68C)/+ embryos as described in the Materials and Methods. Fourteen independent transformant lines were recovered for the positive construct p[w⁺, X9]: three on the X-chromosome, seven on the second chromosome and four on the third chromosome. Thirteen independent transformant lines were recovered for the negative construct p[w⁺, B10]: three on the X-chromosome, four on the second and six on the third chromosome. These lines were maintained as homozygous stocks (see Materials and Methods).

[0289] To test whether these constructs were able to rescue the phenotypes of several mus101 alleles, complementation tests were performed. Homozygous mus101^(D1) and mus101^(D2) and heterozygous mus101^(lcd)/FM6, mus101^(sm)/FM7c, mus101^(tsl)/FM7c and mus101^(K451)/FM3 virgin females were crossed with transformant males in which the P-element has been inserted in the autosomes (11 independent lines for the positive construct p[w⁺, X9] and 10 independent lines for the negative construct p[w⁺, B10]). The mutagen-sensitivity test was performed using a dose of 0.1% MMS, as described in Materials and Methods. In this complementation test the phenotype analysed was lethality of hemizygous males after mutagen treatment. The permissive temperature of mus101^(tsl) is 17° C. To test the rescue of the temperature sensitive phenotype, all crosses were done at 25° C. The rescue of the mus101 phenotype was monitored by the recovery of hemizygous males with w⁺ eye colour.

[0290] As can be observed in Table 5, the construct p[w⁺, X9] rescues the phenotypes of all mus101 alleles tested, except for the lethality in the allele mus101^(lcd). In contrast, the construct p[w⁺, B10] rescues none of the various mus101 mutant phenotypes. The positive construct p[w⁺, X9] does not rescue the lethal phenotype of the allele mus101^(lcd) due to a lethal second site mutation, localised outside the duplication y⁺g⁺na⁺Y (12A8; 12F).

[0291] It can be concluded that the mus101 gene corresponds to the ORF of the transcription unit B which is initiated in fragment X9, and disrupted in fragment B10.

TABLE 5
Positive p[w⁺, X9] and negative p[w⁺, B10] rescue constructs of mus101 tested in complementation crosses with different alleles of mus101.

mus101 allele     Phenotype tested           p[w⁺, X9]    p[w⁺, B10]
mus101^(D1)       MMS sensitivity            +            −
mus101^(D2)       MMS sensitivity            +            −
mus101^(K451)     MMS sensitivity            +            −
mus101^(lcd)      lethality                  −            −
mus101^(sm)       lethality                  +            −
mus101^(tsl)      temperature sensitivity    +            −

[0292] Genomic Sequence of Mus101

[0293] The complete sequence of mus101 was obtained by sequencing the distal 6 kb of the X9 genomic restriction fragment (FIG. 6). The X9 fragment was cloned in the vector pBluescript, and both ends were sequenced using commercially available primers. The distal part of X9 overlaps with cDNA “A” (620 bp), and the proximal part (about 700 bp) overlaps with cDNA “C”, corresponding to garnet (sequences not shown, diagrammatic representation in FIG. 4B). To extend the sequence from the distal part of X9 to the complete mus101 genomic sequence, we used the primers previously synthesised for sequencing mus101 cDNAs as well as primers designed specifically for the X9 fragment (Materials and Methods). A total of 5977 nucleotides were sequenced, from the distal part of the X9 fragment, extending through the mus101 coding sequence to meet up with its 3′ non-coding region.

[0294] The mus101 genomic sequence predicts an ORF of 1425 amino acids, with a predicted molecular weight of 158 kDa. The predicted protein is composed of 169 strongly basic, 181 strongly acidic, 442 hydrophobic and 396 polar amino acids. The calculated isoelectric point is 6.4 and the charge at pH 7 is −8.8. There are three in-frame methionines in the N terminus of the predicted protein, in positions 1, 5 and 7. From the analysis of the nucleotide sequence upstream of the ATGs coding for the methionines at the beginning of the predicted Mus101 protein, it is likely that the first ATG is the translation start site. Of 10 nucleotides, 5 match the consensus sequence (CACAACCAAAATG, Cavener and Ray, 1991), including the most conserved “CA” at positions −4 and −3, respectively, and three other nucleotides match the second best option. The regions upstream of the ATGs at positions 3 and 7 have three nucleotides identical to the consensus sequence, including the most conserved A at position −3, and four nucleotides match the second best option. The coding region is followed by 548 nucleotides of 3′ non-coding region, including four consensus polyadenylation signals (AATAAA; FIG. 6, bold).
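The way in which an upstream region is scored against the translation start consensus can be sketched as follows. This is an illustrative sketch only: the consensus is the one quoted in the text, but the example upstream sequence is hypothetical and is not the mus101 sequence of FIG. 6; it was constructed for this example so that it matches at five positions including −4 and −3. The function name `consensus_matches` is likewise introduced here.

```python
# The ten nucleotides preceding the ATG in the consensus CACAACCAAAATG,
# i.e. positions -10 to -1.
CONSENSUS_UPSTREAM = "CACAACCAAA"


def consensus_matches(upstream):
    """Return the positions (-10..-1) where `upstream` matches the consensus."""
    if len(upstream) != len(CONSENSUS_UPSTREAM):
        raise ValueError("expected exactly 10 nucleotides upstream of the ATG")
    offset = len(CONSENSUS_UPSTREAM)
    return [
        i - offset
        for i, (base, cons) in enumerate(zip(upstream.upper(), CONSENSUS_UPSTREAM))
        if base == cons
    ]


# Hypothetical upstream 10-mer, NOT taken from the mus101 genomic sequence.
example_upstream = "CTCGATCAGT"
hits = consensus_matches(example_upstream)
print(len(hits), hits)
```

Scoring each candidate ATG in this way allows the three in-frame start codons to be ranked by the number of positions matching the consensus, with particular weight given to positions −4 and −3.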

[0295] The 5′ mus101 sequence is represented by 408 bp upstream of the first ATG in FIG. 6. It is likely to contain at least part of the promoter region since it has two DRE (Drosophila DNA replication-related element) motifs, localised at positions −377 to −370 and −352 to −345 with respect to the first ATG (FIG. 6, blue). DRE is a cis-acting positive regulatory element present in the promoters of a variety of genes, including those encoding proteins with functions in DNA replication, transcription, translation, signal transduction and cell cycle (Matsukage et al., 1995). It is represented by the palindromic sequence 5′-TATCGATA. Other genes containing DRE motifs include those for the DNA polymerase α 180-kDa and 73-kDa subunits, PCNA and Cyclin A. It has been shown by both in vivo and in vitro experiments that DRE motifs are required for the activities of the promoters of these genes. The activity is mediated by DREF, a factor that associates specifically with DRE sequences.

[0296] The presence of DRE sequences in the promoter of the mus101 gene strongly suggests that these sequences are important for its transcription, as is the case for other genes required for DNA replication.

[0297] The Drosophila PCNA gene promoter contains at least four transcriptional regulatory elements: besides DRE, the regulatory elements E2F, URE (Upstream Regulatory Element) and CFDD (Common regulatory Factor for DNA replication and DREF genes) have been demonstrated to be important. Specific protein factors bind to each one of these regulatory elements. Of these regulatory elements, only CFDD (other than DRE) is found in the mus101 promoter region. The CFDD sequence present in the mus101 promoter is 5′-CGATA, and overlaps with the DRE elements. In the PCNA promoter, the CFDD sequence has been shown to promote transcription independently of DRE, even though they have overlapping recognition sequences.
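The relationship between the DRE and CFDD motifs can be illustrated with a simple motif scan. The sketch below is illustrative only: the motif sequences are those given in the text, but the promoter fragment `toy_promoter` is a hypothetical sequence constructed for this example and is not the mus101 5′ region shown in FIG. 6. The helper names are also introduced here.

```python
DRE = "TATCGATA"   # DNA replication-related element
CFDD = "CGATA"     # CFDD recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return 0-based start indices of all (possibly overlapping) occurrences."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


# Hypothetical promoter fragment ending in the start codon (NOT mus101).
toy_promoter = "GGTATCGATAACCGTTTATCGATAGGCATG"
atg_index = toy_promoter.index("ATG")

dre_hits = find_motif(toy_promoter, DRE)
cfdd_hits = find_motif(toy_promoter, CFDD)

# Positions relative to the A of the ATG (-1 is the base immediately upstream).
dre_positions = [(i - atg_index, i - atg_index + len(DRE) - 1) for i in dre_hits]
print(dre_positions, cfdd_hits)
```

Two properties described in the text follow directly from the motif sequences themselves: DRE equals its own reverse complement, so it is palindromic, and CFDD is contained within DRE, so every DRE site necessarily carries an overlapping CFDD site.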

[0298] Mus101 Predicted Protein has 7 BRCT Domains

[0299] The Mus101 predicted protein has seven regions of repeated BRCT domains. BRCT stands for BRCA1 C-Terminus. The motif was originally noted as a repeated region in the carboxy-terminus of the human ovarian and breast cancer gene BRCA1; in the budding yeast DNA repair and checkpoint protein Rad9; and in a human protein that binds to p53, named 53BP1 (Koonin et al., 1996). The use of more sophisticated computer programs later identified BRCT domains in a larger number of proteins (Altschul et al., 1997; Bork et al., 1997; Callebaut and Mornon, 1997). Among the proteins containing BRCT domains, in addition to those previously reported by Koonin et al., 1996, are: XRCC1, Rad4, Ect2, Rev1, Crb2, Rap1, terminal deoxynucleotidyltransferases (TdT), human DNA ligases III and IV (Bork et al., 1997; Callebaut and Mornon, 1997), and TopBP1 (Yamane et al., 1997). The majority of these proteins are involved in DNA repair and cell cycle checkpoint control.

[0300] For example, Rad9 is a budding yeast protein required to mediate G1 and G2 checkpoint arrest following DNA damage. The human XRCC1 encodes a protein involved in rejoining of DNA single-strand breaks that arise following treatment with alkylating agents or ionising radiation. The fission yeast Rad4/Cut5 protein is necessary for arrest in response to incomplete DNA replication and Crb2 is a protein that interacts with Cut5 and is required for both DNA damage and replication checkpoints.

[0301] A BRCT domain is composed of approximately 100 amino acids with a characteristic hydrophobic profile which is easily identified in an HCA (hydrophobic cluster analysis) plot (Callebaut and Mornon, 1997). The HCA of the Mus101 predicted protein showing the localisation of the BRCT domains is given in FIG. 7. In fact, the similarities between BRCT domains are more pronounced when the HCA of the protein is examined than when its amino acid sequence is examined. As an example (FIG. 8), similarities are shown between the highly conserved BRCT domain I in Mus101 and TopBP1, C. elegans clone F37D6.1, Rad4 and Ect2, by alignment of both the HCA plots (A) and the amino acid sequences (B).

[0302] The number and position of BRCT domains vary in different proteins. BRCA1, Rad9, XRCC1 and Crb2 have two domains in their C-termini, Ect2 has two domains in its N-terminus, and Rad4/Cut5 and Dpb11 have two domains in their N-termini and two domains in the central part of the protein (Bork et al., 1997; Callebaut and Mornon, 1997). The majority of the proteins described so far have on average two to four BRCT domains, although some have a greater number, like the C. elegans clone F37D6.1, of unknown function, which has 6 domains (Bork et al., 1997; Callebaut and Mornon, 1997). The human TopBP1 protein has eight BRCT domains, the greatest number of domains so far observed in a single protein (Yamane et al., 1997). Mus101 has seven BRCT domains, and their distribution is similar to that of TopBP1, with only the central BRCT of TopBP1 missing in Mus101 (FIG. 9). The positions of the Mus101 BRCT domains are: BRCT I: 121-205, BRCT II: 217-304, BRCT III: 397-485, BRCT IV: 605-697, BRCT V: 708-792, BRCT VI: 1206-1293 and BRCT VII: 1328-1409. The BRCT domains I, III, V and VII are represented in green and the BRCT domains II, IV and VI in red in FIG. 6. A comparison of the localisation of BRCT domains in Mus101, TopBP1, F37D6.1, Rad4/Cut5 and Ect2 is presented in FIG. 8C.
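The domain coordinates listed above can be tabulated and checked for internal consistency. The sketch below is illustrative only: the residue ranges and the protein length of 1425 amino acids are taken from the text, and the names `BRCT_DOMAINS` and `domains_are_ordered_and_disjoint` are introduced here for the example.

```python
PROTEIN_LENGTH = 1425  # predicted Mus101 ORF, in amino acids

# BRCT domain boundaries (first and last residue, inclusive), in N-to-C order.
BRCT_DOMAINS = {
    "I": (121, 205),
    "II": (217, 304),
    "III": (397, 485),
    "IV": (605, 697),
    "V": (708, 792),
    "VI": (1206, 1293),
    "VII": (1328, 1409),
}

# Length of each domain in residues.
lengths = {name: end - start + 1 for name, (start, end) in BRCT_DOMAINS.items()}


def domains_are_ordered_and_disjoint(domains, protein_length):
    """Check that domains do not overlap and all lie within the protein."""
    previous_end = 0
    for start, end in domains.values():
        if not (previous_end < start <= end <= protein_length):
            return False
        previous_end = end
    return True


print(lengths)
print(domains_are_ordered_and_disjoint(BRCT_DOMAINS, PROTEIN_LENGTH))
```

The computed lengths range from 82 to 93 residues, in line with the description of a BRCT domain as approximately 100 amino acids, and the domains fall into an N-terminal pair, a central group and a C-terminal pair.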

[0303] It has been suggested by Callebaut and Mornon, 1997 and Bork et al., 1997 that the BRCT domains may be the sites for interaction between proteins implicated in the maintenance of the genome in response to damage. The knowledge of such interactions is beginning to shed light on poorly understood repair mechanisms. Examples are accumulating, showing that BRCT domain containing proteins can participate in different aspects of DNA repair and cell cycle checkpoint control.

[0304] The protein encoded by the breast and ovarian cancer gene BRCA1 interacts in vivo with the BARD1 (BRCA1-Associated RING Domain) protein (Wu et al., 1996). BARD1 and BRCA1 have similar structural organisation. Both proteins have RING fingers at their N-termini, and two BRCT domains at their C-termini (Koonin et al., 1996; Wu et al., 1996). The interaction between these proteins is through their N-terminal regions (Wu et al., 1996). The N-terminus of BRCA1 also interacts with the C-terminus of p53, and it is suggested that BRCA1 can enhance p53-dependent gene expression, acting as a p53 coactivator. Recently, BRCA1 was found to be a component of the RNA polymerase II holoenzyme by its association with RNA helicase A. The interaction between these proteins is made through the BRCA1 C-terminus, where two BRCT domains are present. The interaction between BRCA1 and RNA polymerase II and the requirement of BRCA1 for transcription-coupled repair of oxidative damage reinforce the model that BRCA1 acts as a transcriptional coactivator.

[0305] The N-terminal region of Rad4/Cut5, which has two repeated BRCT domains, interacts with the N-terminus of Crb2 (for Cut5-repeated binding), a protein required for checkpoint arrest induced by UV irradiation and DNA polymerase mutants. The interaction is dependent on the BRCT domain, since a construct containing the amino acid substitution T45M, in the first BRCT domain of Rad4/Cut5 (the same mutation observed in rad4⁺/cut5⁺ mutants) is not able to interact in a two-hybrid system with Crb2. Rad4/Cut5 and Crb2 also interact with Chk, and it has been suggested by these authors that the three proteins may form a checkpoint sensor-transmitter pathway to arrest the cell cycle.

[0306] The human proteins XRCC1 and DNA ligase III are also physically associated and interact in vitro and in vivo. The association between XRCC1 and DNA ligase III is made through the C-terminal region of both proteins, where their BRCT domains are located. There are two distinct forms of DNA ligase III, α and β, that result from different mRNA splicing. Only DNA ligase III α, which contains a BRCT domain in its C-terminus, associates with XRCC1. The human protein XRCC1 also interacts with DNA polymerase-β. A possible interaction between XRCC1 and poly (ADP-ribose) polymerase (PARP) has been suggested previously. XRCC1 interacts with PARP through its central region, which contains a BRCT domain. The importance of the BRCT domains in XRCC1 is further highlighted by the identification of mutations in four mutant cell lines of hamster XRCC1 that either alter or disrupt the BRCT domains.

[0307] The DNA repair protein XRCC4 is involved in the repair of DNA double-strand breaks and in V(D)J recombination. XRCC4 is a nuclear phosphoprotein and is an effective substrate for DNA-PK in vitro. XRCC4 associates with DNA ligase IV via the DNA ligase IV C-terminal BRCT domains. Results obtained previously suggest that XRCC4 acts as a molecular bridge to target DNA ligase IV to other components of the DNA non-homologous end-joining apparatus, and implicate DNA ligase IV in the joining of double-strand breaks via the non-homologous end-joining pathway and in the ligation steps of V(D)J recombination.

[0308] Recently, a protein (TopBP1) was described that interacts in vitro with the C-terminus of DNA Topoisomerase II (Topo II) β (Yamane et al., 1997). The interaction between DNA Topo II β and TopBP1 (Topoisomerase-IIβ-Binding Protein 1) is made by the C-terminus of TopBP1, where two consecutive BRCT domains are located (Yamane et al., 1997). Although DNA Topo II is highly conserved among different organisms, its C-terminus is quite divergent, suggesting that these segments may mediate different cellular functions. Since one of the functions of Topo II is to make transient breaks in DNA during its catalytic reaction, and since TopBP1 has BRCT domains, Yamane et al., 1997 have suggested that TopBP1 may be involved in the repair of DNA strand breaks caused by failure of the catalytic reactions of Topo II.

[0309] Mus101 Predicted Protein is Similar to the Human Protein TopBP1

[0310] A database search using the PSI-BLAST program (Altschul et al., 1997) with the Mus101 predicted protein gives the highest scores for a human protein first described as having similarity to the fission yeast Rad4/Cut5 gene product (Nagase et al., 1996). This human protein is identical to TopBP1 (Yamane et al., 1997). In addition to TopBP1 and Rad4/Cut5, the proteins showing significant similarity to the Mus101 predicted protein are a C. elegans clone F37D6.1, an A. thaliana clone T10M13.12, the mouse transforming protein Ect2, and the human DNA repair protein XRCC1.

[0311] All of these proteins have BRCT domains. In fact, the regions of highest similarity that these proteins show with Mus101 lie in the BRCT domains, with the exception of TopBP1 and the C. elegans clone F37D6.1, where the regions of similarity extend beyond the BRCT domains. FIG. 9 shows the pairwise comparison of Mus101 and TopBP1 using the Blast 2 program (Altschul et al., 1997). Both proteins have a similar size: the Mus101 predicted protein is 1425 amino acids long, and the TopBP1 predicted protein is 1551 amino acids long. The N-terminal and central part of Mus101 has 26% identity and 44% similarity to the N-terminus of TopBP1. This region contains the Mus101 and TopBP1 BRCT domains I, II, III, IV and V. The C-terminus of Mus101 has 35% identity and 55% similarity to the C-terminus of TopBP1. This region contains the Mus101 BRCT domains VI and VII and the TopBP1 BRCT domains VII and VIII. Thus, the region of TopBP1 that contains BRCT domain VI does not share homology with Mus101.
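As an illustration of how identity and similarity percentages of this kind are derived from a pairwise alignment, the following minimal Python sketch counts identical and similar columns over the aligned length. The function name, the toy alignment and the conservative-substitution groups are assumptions made for this example only; BLAST itself scores similar residues ("positives") with a substitution matrix, and the sequences shown are not taken from Mus101 or TopBP1.

```python
# Hypothetical conservative-substitution groups, assumed for illustration only.
# BLAST uses a substitution matrix to decide which residue pairs count as similar.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) over the full aligned length.

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    identical = 0
    similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap columns add to the length but never match
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    length = len(aligned_a)
    return 100.0 * identical / length, 100.0 * similar / length


# Toy alignment of nine columns (one gap column), not real protein data.
ident, simil = percent_identity_similarity("MKTLIA-DS", "MRTLVAGDT")
print(f"{ident:.1f}% identity, {simil:.1f}% similarity")
```

In this toy case five of nine columns are identical and eight of nine are identical or similar. Dividing by the full aligned length, gaps included, follows the way BLAST reports these ratios.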

[0312] BRCT domains vary in number, position and sequence in different proteins. The BRCT domain I seems to be the most conserved domain among different proteins. The BRCT domain I of Mus101 is similar to the BRCT I domains of TopBP1 protein, C. elegans clone F37D6.1, mouse Ect2 and to fission yeast Rad4/Cut5 (FIG. 8).

[0313] The region between amino acids 817 and 1218 of the Mus101 protein, which does not have similarity to human TopBP1, contains two distinct sub-regions with similarity to other proteins. The amino acid sequence 820-956 of Mus101 has 27% identity and 46% similarity to the C-terminus of human treacle protein (TCOF1), a putative nucleolar trafficking phosphoprotein which is defective in patients with Treacher Collins syndrome, a craniofacial developmental disease (Wise et al., 1997). The amino acid sequence 949-1130 of the Mus101 predicted protein is 26% identical (38% similar) to the C-terminus of the Drosophila Posterior sex combs (Psc) protein. Psc is the product of one of the Polycomb group (Pc-G) genes. Pc-G genes are needed to maintain expression patterns during Drosophila development by repressing target genes (for review see Pirrota, 1995).

[0314] Discussion

[0315] mus101 mutations show a wide range of phenotypes. There are three lethal, one female sterile and two mutagen sensitive alleles. Analysis of brains of the late lethal alleles mus101^(tsl) and mus101^(lcd) shows an abnormal undercondensation of heterochromatic regions (Axton, 1990; Gatti et al., 1983). The mutagen sensitive allele mus101^(D1) is defective in post-replication repair (Boyd and Setlow, 1976). The female sterile allele mus101^(K451) is defective in chorion protein gene amplification (Komitopoulou et al., 1983; Komitopoulou et al., 1988; Orr et al., 1984), a form of differential replication of gene clusters in follicle cells which occurs during oogenesis (Spradling, 1981; Spradling and Mahowald, 1980). The analysis of these phenotypes suggests that the wild type Mus101 protein might be involved in different aspects of chromatin structure, DNA replication and repair. However, it may be difficult to distinguish cause from consequence. Is failure in DNA replication responsible for the undercondensed chromatin observed in the alleles mus101^(tsl) and mus101^(lcd), or is the failure of proper condensation responsible for DNA replication problems? We approached the cloning of mus101 to try to answer these questions, hoping that this would permit a molecular analysis of the gene product.

[0316] We have now presented data regarding the localisation of the mus101 gene to a 10 kb BamHI fragment (B10) in the proximal part of cloned DNA fragments from a chromosome walk (FIG. 5). This enabled us to isolate partial cDNAs corresponding to the mus101 transcription unit, to prove by germ line transformation that this was the mus101 gene and to determine its genomic sequence (FIG. 6).

[0317] The genomic sequence of mus101 is 5276 bp long and predicts a single ORF of 1425 amino acids. I cannot be certain of the precise position of the N-terminus of the protein since there are three methionines, at positions +1, +5 and +7. It is possible that the ATG corresponding to the methionine at position +1 is the one used to initiate translation, since the region upstream of it has more similarities to the consensus sequence for translation initiation in Drosophila. It is also possible that the ATGs corresponding to the methionines at positions +5 and +7 are used to initiate translation, maybe in a less effective way, since the region upstream of these ATGs has considerable similarity to the Cavener consensus sequence. I have assigned the Met at position +1 as a potential N-terminal residue. To definitively identify the initiation methionine it would be necessary to perform N-terminal sequencing of the purified Mus101 protein. A definitive indication of gene organisation will necessitate the recovery of a full length cDNA or primer extension experiments. The low number of cDNAs recovered in the screening of embryonic libraries suggests that mus101 is expressed at low levels during this stage of development. However, mus101 is expressed at high levels in ovarian tissue, since I have recently recovered dozens of mus101 clones from an ovarian cDNA library (data not shown). The characterisation of these clones is under way.
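The stated ORF length can be cross-checked against the CDS coordinates (409)..(4686) given in the sequence listing for the 5276 bp genomic sequence. The short Python sketch below does this arithmetic. The function name is hypothetical, and the calculation assumes that the annotated CDS range is 1-based, inclusive, and ends with the stop codon.

```python
def cds_to_protein_length(cds_start, cds_end):
    """Number of amino acids encoded by a CDS with 1-based inclusive coordinates.

    Assumption: the last codon of the range is the stop codon, so it is
    not counted as an amino acid.
    """
    n_nucleotides = cds_end - cds_start + 1
    if n_nucleotides % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return n_nucleotides // 3 - 1


GENOMIC_LENGTH = 5276          # bp, from the sequence listing
CDS_START, CDS_END = 409, 4686  # CDS feature from the sequence listing

protein_length = cds_to_protein_length(CDS_START, CDS_END)
upstream_length = CDS_START - 1               # nucleotides 5' of the first ATG
downstream_length = GENOMIC_LENGTH - CDS_END  # nucleotides 3' of the CDS

print(protein_length, upstream_length, downstream_length)
```

Under these assumptions the CDS spans 4278 nucleotides, or 1426 codons including the stop codon, which agrees with a predicted protein of 1425 amino acids and leaves 408 nucleotides of sequenced DNA upstream of the first ATG.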

[0318] Two DRE sequences in the region 350-380 nucleotides upstream of the first methionine (FIG. 6, blue) suggest that this region might be part of the promoter. DRE sequences are found in the promoters of genes essential for DNA replication, such as PCNA and DNA polymerase α (73 kDa and 180 kDa subunits). It is not surprising to find replication-related sequences in the mus101 promoter, since mus101 is implicated in DNA replication by the mutant phenotype of the female-sterile allele mus101^(K451). Promoter activity assays need to be undertaken to determine whether these DRE sequences are functional.
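A search for DRE motifs of this kind can be sketched in a few lines of Python. The example below scans the first 120 nucleotides of the genomic sequence from the sequence listing for the DRE sequence TATCGATA (the motif named in the title of Matsukage et al., 1995). The function and variable names are illustrative assumptions, and the scan finds exact matches only; because TATCGATA is its own reverse complement, scanning one strand is sufficient.

```python
DRE = "TATCGATA"  # DRE motif as given in the title of Matsukage et al., 1995

# First 120 nucleotides of the genomic sequence, copied from the sequence
# listing in blocks of ten and joined by removing the spaces.
UPSTREAM = "".join(
    """
    ttgcctttcc tttcttcttt cttcctttct tttaggactg agctgccgtc gacgacgtgg
    atatcgatat ttcgcttaat gaaatatatc gatagcttaa accaaaactg ataaataact
    """.split()
)


def find_motif(sequence, motif):
    """Return 1-based start positions of every exact (possibly overlapping) match."""
    sequence, motif = sequence.upper(), motif.upper()
    positions = []
    index = sequence.find(motif)
    while index != -1:
        positions.append(index + 1)  # convert 0-based index to 1-based position
        index = sequence.find(motif, index + 1)
    return positions


dre_positions = find_motif(UPSTREAM, DRE)
print(dre_positions)
```

The same function could be applied to the complete upstream region, or to other promoter elements, by changing the motif string. Degenerate motifs would need a pattern-based search instead of exact matching.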

[0319] Multiple promoter elements are necessary for the efficient transcription of the Drosophila PCNA gene: the DRE, URE, E2F and CFDD elements and their specific binding factors, DREF, UREF, E2F/DP and CFDD, respectively. Recognition sequences for E2F, DRE and CFDD were also identified in the promoters of the genes encoding the Drosophila DNA polymerase α 180 kDa and 73 kDa subunits. By comparison, the potential promoter region of mus101 that I have sequenced has two CFDD motifs, which overlap the DRE sequences. The observed differences between the promoter region of mus101 and the promoter regions of the PCNA gene and the DNA polymerase α 180 kDa and 73 kDa subunit genes could mean either that insufficient upstream DNA of mus101 has been sequenced or that mus101 needs alternative factors to activate its transcription. To fully understand the transcriptional activation of genes involved in DNA replication, the promoters of a larger number of such genes should be analysed.

[0320] The presence of seven BRCT domains in the predicted Mus101 coding region (FIGS. 6 and 7) places the gene into the BRCT superfamily, associated with DNA repair and maintenance of the genome (Bork et al., 1997; Callebaut and Mornon, 1997).

[0321] The Mus101 predicted protein has the highest similarity scores to TopBP1, a human protein that binds to Topo II β (Yamane et al., 1997) and which has eight BRCT domains (FIG. 9). The Mus101 predicted protein is also significantly similar to the fission yeast protein Rad4/Cut5, to human XRCC1, to the mouse oncoprotein Ect2 and to the C. elegans clone F37D6.1. The regions of similarity between the Mus101 predicted protein and Rad4/Cut5, XRCC1 and Ect2 are restricted to the BRCT domains. With human TopBP1 and the C. elegans clone F37D6.1, the regions of similarity extend beyond the BRCT domains. The similarities between the Mus101 predicted protein and TopBP1 lie in the N-terminal and central regions and in the C-terminal region (FIG. 9). One region of Mus101 which is not similar to TopBP1 does share some similarity with the human TCOF1 protein and with the Drosophila Psc protein. However, as no specific functional domains have been identified within these regions, the significance of this is still to be determined.

[0322] Since the similarity between the predicted Mus101 protein and TopBP1 is so extensive, it is possible that Mus101 is a functional homologue of TopBP1. The C-terminal region of TopBP1 that interacts with Topo II β has 35% identity and 55% similarity with the predicted Mus101 protein (FIG. 8B).

[0323] Type II DNA topoisomerases catalyse topological changes in DNA, enabling the decatenation that facilitates the completion of DNA replication and chromatid separation. Failure of these processes would lead to defects in chromatin organisation. In Drosophila, DNA Topo II is present in at least three separate functional pools: one for chromosome condensation, one for chromosome segregation and one which remains associated with the chromosome throughout the cell cycle. Recently, Drosophila Topo II was identified as a component of the chromatin-remodelling factor CHRAC (CHRomatin-Accessibility Complex, Vargas-Weiz et al., 1997), suggesting a structural role for this protein in chromatin organisation. In Drosophila, DNA Topo II is associated with Barren, a protein necessary for chromatid segregation at anaphase (Bhat et al., 1996).

[0324] The structure of DNA topoisomerases is conserved through evolution. The N-terminal region is the ATPase functional domain, and the central region is the DNA breakage and reunion activity domain. The C-terminus of DNA topoisomerases is the least conserved region, and may specify differences in different isoenzymes. Although the C-terminal regions of Topo II are not conserved at the amino acid level, this region is extremely hydrophilic and charged among eukaryotic enzymes (Wyckoff et al., 1989). It is the C-terminal region of the human Topo II β that interacts with TopBP1.

[0325] It is possible that Mus101 is a functional homologue of TopBP1, and that Mus101 also binds to DNA Topo II. It might also be possible that the putative interaction between Mus101 and Topo II is specific to heterochromatic regions. If this interaction were disrupted, the heterochromatic regions would fail to condense properly, which is the phenotype observed in mus101 mutants. If true, this could explain the altered condensation of heterochromatic regions observed in mitotic cells from ganglia in the lethal lines mus101^(lcd) and mus101^(tsl).

[0326] To address this question, antibodies that recognise the Mus101 protein could be used to determine its subcellular localisation, to establish whether this matches the distribution of Topo II, and to determine the distribution of the protein in the mutants mus101^(lcd) and mus101^(tsl). Such antibodies would also enable immunoprecipitation experiments to investigate direct interaction between Topo II and the Mus101 protein. If a physical interaction can be established between the proteins, then it will be possible to determine whether this can modulate Topo II activity.

[0327] The Mus101 BRCT domains I and II are very similar to the BRCT domains I and II of the oncogene product Ect2 and the fission yeast replication checkpoint protein Rad4/Cut5 (FIG. 8), regions that are important to the roles of these proteins in cell cycle control. Truncation of the N-terminal region of Ect2, to remove these two BRCT domains, increases its transforming activity, suggesting that this region has a negative effect on cell division. The N-terminal domain of Rad4/Cut5 containing these BRCT domains is essential for complementation of the temperature sensitive phenotype in cut5 mutants. Moreover, its overexpression blocks cell division (Saka and Yanagida, 1993). The localisation of the mutations in independent rad4⁺/cut5⁺ alleles reveals a consistent amino acid substitution at the 45th codon, in the conserved stretch VTHLIA in BRCT domain I. This amino acid substitution prevents the interaction of the Rad4/Cut5 protein with the DNA replication and DNA damage checkpoint protein Crb2.

[0328] mus101 and rad4⁺/cut5⁺ mutants have similar phenotypes. Both are mutagen sensitive and are defective in DNA repair. Both genes also encode members of the BRCT superfamily, with similar BRCT domains I and II. The Rad4/Cut5 gene product is a component of the replication checkpoint control system and, together with Crb2, forms a complex with the checkpoint protein Chk1. It has been suggested that these proteins may form a checkpoint sensor-transmitter pathway to arrest the cell cycle.

[0329] mus101 may be necessary for the DNA replication checkpoint. This could be assessed by in vivo and in vitro experiments that test the response of mus101 alleles to drugs that block DNA replication. In S. pombe, the replication checkpoint is categorised into two classes: one resulting from a defect in the replication machinery (mutations in polymerases and ligases), and the other caused by limitation of the normal nucleotide supply (HU and cdc22 mutation). Rad4/Cut5 is necessary for both checkpoints, whereas Crb2 is necessary only for the nucleotide supply checkpoint. It would be interesting to investigate whether mus101 is necessary for either of these replication checkpoints. The mutant mus101^(D1) has been tested for sensitivity to HU (Banga et al., 1986). This study revealed that the DNA replication checkpoint appears to be still intact in this mutant. It is important to check the HU sensitivity of other mus101 alleles, such as mus101^(tsl) and the female sterile allele mus101^(K451), and to determine whether the checkpoint caused by a defect in the replication machinery is present.

[0330] A number of proteins belonging to the BRCT superfamily are involved in different aspects of DNA repair. For example, the human protein XRCC1 is necessary for the repair of single-strand breaks, and the protein XRCC4 is necessary for double-strand break repair and for V(D)J recombination. The study of the interaction of these proteins with other proteins, which may or may not contain BRCT domains, offers possibilities to elucidate these repair mechanisms.

[0331] XRCC1 interacts with DNA ligase III, PARP and DNA polymerase β. Although the mechanism of action of these proteins is not understood, it seems reasonable that the four proteins may form a complex that acts in base excision repair. XRCC4 interacts with DNA ligase IV, and it has been suggested that the function of the XRCC4-DNA ligase IV complex may be to carry out the final steps of V(D)J recombination and the joining of DNA ends.

[0332] The mutagen sensitive allele mus101^(D1) is partially defective in postreplication repair (Boyd and Setlow, 1976). The mechanism of postreplication repair is not well understood in higher eukaryotes. It may be possible to try to elucidate the process of postreplication repair by identifying proteins that interact with mus101, determining the sites of such interactions, and by analysing the nature of the mus101^(D1) mutation.

[0333] References

[0334] Altschul, S. F., Madden, T. L., Schaffer, A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25: 3389-3402.

[0335] Ashburner, M. (1989). Drosophila, a laboratory manual (Cold Spring Harbour: Laboratory Press).

[0336] Axton, M. (1990). Genetic and molecular analysis of mitotic chromosome condensation in Drosophila. Ph.D. Thesis (London, Imperial College of Science and Technology).

[0337] Baker, B. S., and Smith, D. A. (1979). The effects of mutagen-sensitive mutants of Drosophila melanogaster in nonmutagenized cells. Genetics, 92: 833-847.

[0338] Banga, S. S., Shenkar, R., and Boyd, J. B. (1986). Hypersensitivity of Drosophila mei-41 mutants to hydroxyurea is associated with reduced mitotic chromosome stability. Mutation Res., 163: 157-165.

[0339] Bhat, M. A., Philp, A. V., Glover, D. M., and Bellen, H. J. (1996). Chromatid segregation at anaphase requires the barren product, a novel chromosome-associated protein that interacts with Topoisomerase II. Cell, 87: 1103-1114.

[0340] Bork, P., Hofmann, K., Bucher, P., Neuwald, A. F., Altschul, S. F., and Koonin, E. V. (1997). A superfamily of conserved domains in DNA damage-responsive cell cycle checkpoint proteins. FASEB J., 11: 68-76.

[0341] Boyd, J. B., Golino, M. D., Nguyen, T. D., and Green, M. M. (1976). Isolation and characterization of X-linked mutants of Drosophila melanogaster which are sensitive to mutagens. Genetics, 84: 485-506.

[0342] Boyd, J. B., Golino, M. D., and Setlow, R. B. (1976). The mei-9a mutant of Drosophila melanogaster increases mutagen sensitivity and decreases excision repair. Genetics, 84: 527-544.

[0343] Boyd, J. B., Golino, M. D., Shaw, C. V., Osgood, C. J., and Green, M. M. (1981). Third-chromosome mutagen-sensitive mutants of Drosophila melanogaster. Genetics, 97: 607-623.

[0344] Boyd, J. B., Mason, J. M., Yamamoto, A. H., Brodberg, R. K., Banga, S. S., and Sakaguchi, K. (1987). A genetic and molecular analysis of DNA repair in Drosophila. J. Cell Sci. Supplement, 6: 39-60.

[0345] Boyd, J. B., and Setlow, R. B. (1976). Characterization of postreplication repair in mutagen-sensitive strains of Drosophila melanogaster. Genetics, 84: 507-526.

[0346] Brown, T. C., and Boyd, J. B. (1981). Postreplication repair-defective mutants of Drosophila melanogaster fall into two classes. Mol. Gen. Genet., 183: 356-362.

Callebaut, I., and Mornon, J.-P. (1997). From BRCA1 to RAP1: a widespread BRCT module closely associated with DNA repair. FEBS Letters, 400: 25-30.

[0347] Cavener, D. R., and Ray, S. C. (1991). Eukaryotic start and stop translation sites. Nucleic Acids Res., 19: 3185-3192.

[0348] Fenech, M., Carr, A. M., Murray, J., Watts, F. Z., and Lehmann, A. R. (1991). Cloning and characterization of the rad4 gene of Schizosaccharomyces pombe, a gene showing short regions of sequence similarity to the human XRCC1 gene. Nucleic Acids Res., 19: 6737-6741.

[0349] Gatti, M. (1979). Genetic control of chromosome breakage and rejoining in Drosophila melanogaster: spontaneous chromosome aberration in X-linked mutants defective in DNA metabolism. Proc. Natl. Acad. Sci. (USA), 76: 1377-1381.

[0350] Gatti, M., Smith, D. A., and Baker, B. S. (1983). A gene controlling condensation of heterochromatin in Drosophila melanogaster. Science, 221: 83-85.

[0351] Gowen, L. C., Avrustskaya, A. V., Latour, A. M., Koller, B. H., and Leadon, S. A. (1998). BRCA1 required for transcription-coupled repair of oxidative damage. Science, 281: 1009-1012.

[0352] Greenwell, P. W., Kronmal, S. L., Porter, S. E., Gassenhuber, J., Obermaier, B., and Petes, T. D. (1995). TEL1, a gene involved in controlling telomere length in S. cerevisiae, is homologous to the human Ataxia Telangiectasia gene. Cell, 82: 823-829.

[0353] Henderson, D. H., Bailey, D. A., Sinclair, D. A. R., and Grigliatti, T. A. (1987). Isolation and characterization of second chromosome mutagen-sensitive mutations in Drosophila melanogaster. Mutation Res., 177: 83-93.

[0354] Karess, R. E. (1985). P-element mediated germline transformation of Drosophila. In DNA cloning, D. M. Glover, ed. (Oxford: IRL Press).

[0355] Komitopoulou, K., Gans, M., Margaritis, L. H., Kafatos, F. C., and Masson, M. (1983). Isolation and characterization of sex-linked female-sterile mutants in Drosophila melanogaster with special attention to eggshell mutants. Genetics, 105: 897-920.

[0356] Komitopoulou, K., Margaritis, L. H., and Kafatos, F. C. (1988). Structural and biochemical studies on four sex-linked chorion mutants of Drosophila melanogaster. Dev. Genet., 9: 37-48.

[0357] Koonin, E. V., Altschul, S. F., and Bork, P. (1996). BRCA1 protein products: functional motifs. Nature Genet., 13: 266-268.

[0358] Levis, R., Hazelrigg, T., and Rubin, G. M. (1985). Separable cis-acting elements for expression of the white gene of Drosophila. EMBO J, 4: 3489-3499.

[0359] Lindsley, D., and Zimm, G. (1992). The genome of Drosophila melanogaster (New York: Academic Press).

[0360] Matsukage, A., Hirose, F., Hayashi, Y., Hamada, K., and Yamaguchi, M. (1995). The DRE sequence TATCGATA, a putative promoter-activating element for Drosophila melanogaster cell-proliferation-related genes. Gene, 166: 233-236.

[0361] Nagase, T., Ishikawa, K., Miyajima, N., Tanaka, A., Kotani, H., Nomura, N., and Ohara, O. (1998). Prediction of the coding sequences of unidentified human genes. IX. The complete sequences of 100 new cDNA clones from brain which can code for large proteins in vitro. DNA Res., 5: 31-39.

[0362] Nagase, T., Seki, N., Ishikawa, K., Ohira, M., Kawarabayasi, Y., Ohara, O., Tanaka, A., Kotani, H., Miyajima, N., and Nomura, N. (1996). Prediction of the coding sequences of unidentified human genes. VI. The coding sequences of 80 new genes (KIAA0201-KIAA0280) deduced by analysis of cDNA clones from cell line KG-1 and brain. DNA Res., 3: 321-329.

[0363] Orr, W., Komitopoulou, K., and Kafatos, F. (1984). Mutants suppressing in trans chorion gene amplification in Drosophila. Proc. Natl. Acad. Sci. (USA), 81: 3773-3777.

[0364] Orr-Weaver, T. L. (1991). Drosophila chorion genes: cracking the eggshell's secrets. BioEssays, 13: 97-105.

[0365] Perrimon, N., and Gans, M. (1983). Clonal analysis of the tissue specificity of recessive female-sterile mutations of Drosophila melanogaster using a dominant female-sterile mutation fs(1)K1237. Developmental Biology, 100: 365-373.

[0366] Pirrota, V. (1995). Chromatin complexes regulating gene expression in Drosophila. Current Opinion in Genetics & Development, 5: 466-472.

[0367] Pirrota, V. (1988). Vectors for P-element transformation. In Drosophila: A practical approach, R. L. Rodriguez and D. T. Denhardt, eds. (Boston and London: Butterworths), pp. 437-456.

[0368] Poole, S. J., Kauvar, L. M., Drees, B., and Kornberg, T. (1985). The engrailed locus of Drosophila: structural analysis of an embryonic transcript. Cell, 40: 37-43.

[0369] Roberts, D. B. (1986). Drosophila: a practical approach (Oxford: IRL Press).

[0370] Saka, Y., and Yanagida, M. (1993). Fission yeast cut5+, required for S phase onset and M phase restraint, is identical to the radiation-damage repair gene rad4+. Cell, 74: 383-393.

[0371] Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). Molecular cloning: a Laboratory manual (Cold Spring Harbour: Laboratory Press).

[0372] Smith, D. (1976). Mutagen sensitivity of Drosophila melanogaster. Mol. Gen. Genet., 149: 73-85.

[0373] Smith, D. A., Baker, B. S., and Gatti, M. (1985). Mutations in genes encoding essential mitotic functions in Drosophila melanogaster. Genetics, 110: 647-670.

[0374] Snyder, R. D., and Smith, P. D. (1982). Mutagen sensitivity of Drosophila melanogaster. V. Identification of second chromosomal mutagen sensitive strains. Mol. Gen. Genet., 188: 249-255.

[0375] Spradling, A. C. (1981). The organization and amplification of two chromosomal domains containing Drosophila chorion genes. Cell, 27: 193-201.

[0376] Spradling, A. C., and Mahowald, A. P. (1980). Amplification of genes for chorion proteins during oogenesis in Drosophila melanogaster. Proc. Natl. Acad. Sci. (USA), 77: 1096-1100.

[0377] Vargas-Weiz, P. D., Wilm, M., Bonte, E., Dumas, K., Mann, M., and Becker, P. B. (1997). Chromatin-remodelling factor CHRAC contains the ATPases ISWI and Topoisomerase II. Nature, 388: 598-602.

[0378] Wise, C. A., Chiang, L. C., Pazaekas, W. A., Sharma, M., Musy, M. M., Ashley, J. A., Lovett, M., and Jabs, E. W. (1997). TCOF1 gene encodes a putative nuclear phosphoprotein that exhibits mutations in Treacher Collins Syndrome throughout its coding region. Proc. Natl. Acad. Sci. (USA), 94: 3110-3115.

[0379] Wu, L. C., Wang, Z. W., Tsan, J. T., Spillman, M. A., Phung, A., Xu, X. L., Yang, M. C., Hwang, L. Y., Bowcock, A. M., and Baer, R. (1996). Identification of a RING protein that can interact in vivo with the BRCA1 gene product. Nature Genet., 14: 430-440.

[0380] Wyckoff, E., Natalie, D., Nolan, J., Lee, M., and Hsieh, T.-s. (1989). Structure of the Drosophila DNA Topoisomerase II gene. Nucleotide sequence and homology among Topoisomerases II. J. Mol. Biol., 205: 1-13.

[0381] Yamamoto, Y., Girard, F., Bello, B., Affolter, A., and Gehring, W. J. (1997). The cramped gene of Drosophila is a member of the Polycomb-group, and interacts with mus209, the gene encoding Proliferating Cell Nuclear Antigen. Development, 124: 3385-3394.

[0382] Yamane, K., Kawabata, M., and Tsuruo, T. (1997). A DNA-topoisomerase-II-binding protein with eight repeating regions similar to DNA-repair enzymes and to cell-cycle regulator. Eur. J. Biochem., 250: 794-799.

[0383]

1 48 1 5276 DNA Drosophila sp. CDS (409)..(4686) 1 ttgcctttcc tttcttcttt cttcctttct tttaggactg agctgccgtc gacgacgtgg 60 atatcgatat ttcgcttaat gaaatatatc gatagcttaa accaaaactg ataaataact 120 gcttattttc gaaatatttt atttatttga taaaaagtaa ataaaaaatc cttacatctt 180 agagataaat aattaaaatt acccatacac attatttttt aattcgccgt cggcgttgcc 240 acatagctca actcgtagtt gccaactaat cacaagccag cgttgccaac ggtttgaagg 300 caataacaaa cggcgccaaa atagcgtacg cgaaaaaaag tgcagtgcga aaatcacggc 360 aattttgcag cgcatcacgg agcagagaac acactcgcaa ccgccatc atg agc ata 417 Met Ser Ile 1 agc atg agc atg gac gag acc atc tgc gcg tac ttc gtg aac aat ctg 465 Ser Met Ser Met Asp Glu Thr Ile Cys Ala Tyr Phe Val Asn Asn Leu 5 10 15 aag ccg ggc gac ggt ggc gtc cag gag gcc gac aca ctg cag caa ttc 513 Lys Pro Gly Asp Gly Gly Val Gln Glu Ala Asp Thr Leu Gln Gln Phe 20 25 30 35 gag gcg gcg cgg gag cta ctg ggc caa cag ttg gcg gaa acg cag atc 561 Glu Ala Ala Arg Glu Leu Leu Gly Gln Gln Leu Ala Glu Thr Gln Ile 40 45 50 cgg caa ata aag cca agc gaa gga tat ccc ctg atc gca gcc ggt aac 609 Arg Gln Ile Lys Pro Ser Glu Gly Tyr Pro Leu Ile Ala Ala Gly Asn 55 60 65 ctt acc aag aag gac gtc ttt gtg ctg acc cag ttc gag ggc gaa ttc 657 Leu Thr Lys Lys Asp Val Phe Val Leu Thr Gln Phe Glu Gly Glu Phe 70 75 80 ttc gag caa ttg cag cag acg cga gca cta att ctg ggg cca ccg tgc 705 Phe Glu Gln Leu Gln Gln Thr Arg Ala Leu Ile Leu Gly Pro Pro Cys 85 90 95 ctc atc acc tgc ctg cgg cgc aat gaa ccc att ccc gag ggc agc agt 753 Leu Ile Thr Cys Leu Arg Arg Asn Glu Pro Ile Pro Glu Gly Ser Ser 100 105 110 115 gcc atc tac agc acg gcc atg cgg gat ctg cag gtc tcg gcc acg ggc 801 Ala Ile Tyr Ser Thr Ala Met Arg Asp Leu Gln Val Ser Ala Thr Gly 120 125 130 ata aca cca cag aag aaa gag gaa ttg agc agg ctc ata aac tgg atg 849 Ile Thr Pro Gln Lys Lys Glu Glu Leu Ser Arg Leu Ile Asn Trp Met 135 140 145 ggc ggc ata tac ttt caa agc ttc ggg cat cgc acc acc cac ctc att 897 Gly Gly Ile Tyr Phe Gln Ser Phe Gly His Arg Thr Thr His Leu Ile 150 155 160 tcg aac acc atc 
aag tcc agc aag tac gag cag gca acg ctg aac gga 945 Ser Asn Thr Ile Lys Ser Ser Lys Tyr Glu Gln Ala Thr Leu Asn Gly 165 170 175 gta ccc gta atg cac gtc gac tgg gtg cag tac gtc tgg gat cag agt 993 Val Pro Val Met His Val Asp Trp Val Gln Tyr Val Trp Asp Gln Ser 180 185 190 195 cgt cgc agc cag cgc gag ggc atc atg gcc acg gat cct gat ttc gat 1041 Arg Arg Ser Gln Arg Glu Gly Ile Met Ala Thr Asp Pro Asp Phe Asp 200 205 210 aag tat cgc ctg ccc att ttc ttt ggt gcg aat atc acg tgc agt gga 1089 Lys Tyr Arg Leu Pro Ile Phe Phe Gly Ala Asn Ile Thr Cys Ser Gly 215 220 225 ttg gat gtg gcg cgc aag gat caa gtt atg cgg ctg gtc aac gat aat 1137 Leu Asp Val Ala Arg Lys Asp Gln Val Met Arg Leu Val Asn Asp Asn 230 235 240 gga ggc atc tat cat cgt gcc ttt cgc tcc cag gtg gtg gac atc gtc 1185 Gly Gly Ile Tyr His Arg Ala Phe Arg Ser Gln Val Val Asp Ile Val 245 250 255 atc acc gag caa aca aaa acg gac acc gag aag tat aag gca gcc ata 1233 Ile Thr Glu Gln Thr Lys Thr Asp Thr Glu Lys Tyr Lys Ala Ala Ile 260 265 270 275 cgc tac aag aag gat gtc ttg ctg ccg gaa tgg atc ttc gat agc tgc 1281 Arg Tyr Lys Lys Asp Val Leu Leu Pro Glu Trp Ile Phe Asp Ser Cys 280 285 290 aat cgc ggc tac gct ctg ccc aca aag gac tat gag gtg cgg cct ggc 1329 Asn Arg Gly Tyr Ala Leu Pro Thr Lys Asp Tyr Glu Val Arg Pro Gly 295 300 305 aag acg tcg tcc aca ccc acc aag acc acg cgt ccg ggc gca gct ccc 1377 Lys Thr Ser Ser Thr Pro Thr Lys Thr Thr Arg Pro Gly Ala Ala Pro 310 315 320 ggt gca gat caa acg cac ctc tcg gat ctc tca cgt atc agc ttc gtc 1425 Gly Ala Asp Gln Thr His Leu Ser Asp Leu Ser Arg Ile Ser Phe Val 325 330 335 tcc ggc tcg cgt cgc atg tgc agc gat ctt agt acc gtc aac gaa tcc 1473 Ser Gly Ser Arg Arg Met Cys Ser Asp Leu Ser Thr Val Asn Glu Ser 340 345 350 355 gtc agc agt gtg ggc agc agt tcg ccc gcc aag cag ctg ctc aag cag 1521 Val Ser Ser Val Gly Ser Ser Ser Pro Ala Lys Gln Leu Leu Lys Gln 360 365 370 gcg act agc agt ggc cgc aac tac cag cag gtg ctg gcc gag att gaa 1569 Ala Thr Ser Ser Gly Arg Asn Tyr Gln Gln Val Leu Ala 
Glu Ile Glu 375 380 385 ccg cgt cag gcg aaa aaa gcg ggc gcc ttt ctg gat ggc tgc tgt gtg 1617 Pro Arg Gln Ala Lys Lys Ala Gly Ala Phe Leu Asp Gly Cys Cys Val 390 395 400 tat ttg agt ggc ttc cgc tca gag gag cgt gag aag cta aac aga gtg 1665 Tyr Leu Ser Gly Phe Arg Ser Glu Glu Arg Glu Lys Leu Asn Arg Val 405 410 415 ctg aat acg ggc gga gcg acc cgc tac gat gag gcc aat gag ggc atc 1713 Leu Asn Thr Gly Gly Ala Thr Arg Tyr Asp Glu Ala Asn Glu Gly Ile 420 425 430 435 tca cac atc att gtg ggc caa ctg gat gac gcc gaa tac cga cag tgg 1761 Ser His Ile Ile Val Gly Gln Leu Asp Asp Ala Glu Tyr Arg Gln Trp 440 445 450 cag cgc gat ggt ctc atg ggt tca gtt cat gtg gtg cgc cta gat tgg 1809 Gln Arg Asp Gly Leu Met Gly Ser Val His Val Val Arg Leu Asp Trp 455 460 465 ctg ctg gag agc att cga gct ggt cgc gtg gtc agt gag ttg gtg cat 1857 Leu Leu Glu Ser Ile Arg Ala Gly Arg Val Val Ser Glu Leu Val His 470 475 480 cgt gtg tcg atg cca cag aat cga gaa cca gac gtt gcc tct cct gcc 1905 Arg Val Ser Met Pro Gln Asn Arg Glu Pro Asp Val Ala Ser Pro Ala 485 490 495 agc aag cga aca ctg cgc tcc atg aac cac agc ttc aag cag cca aca 1953 Ser Lys Arg Thr Leu Arg Ser Met Asn His Ser Phe Lys Gln Pro Thr 500 505 510 515 ttg ccc atc aag aag aag ctt ttc gat cag gaa ccg gat ccc gtg cag 2001 Leu Pro Ile Lys Lys Lys Leu Phe Asp Gln Glu Pro Asp Pro Val Gln 520 525 530 gaa cag gag cac gag gag ccg gat cat acg ctg ctg gat cag tac tca 2049 Glu Gln Glu His Glu Glu Pro Asp His Thr Leu Leu Asp Gln Tyr Ser 535 540 545 cag gat caa gga gca gtg gca caa ctg cca ccg gca gat gtt agt ctc 2097 Gln Asp Gln Gly Ala Val Ala Gln Leu Pro Pro Ala Asp Val Ser Leu 550 555 560 ctt caa cca gcg gca agt tcc acc caa atg gat ata cgc cag cga gtc 2145 Leu Gln Pro Ala Ala Ser Ser Thr Gln Met Asp Ile Arg Gln Arg Val 565 570 575 tcg gta gcc aat cca aaa cca ccc gct gag ggt ttg caa ttg ccg gat 2193 Ser Val Ala Asn Pro Lys Pro Pro Ala Glu Gly Leu Gln Leu Pro Asp 580 585 590 595 ctc agt gcc agc act cta tcc att gat ttc gat aag ctg gac tac ttc 2241 Leu Ser 
Ala Ser Thr Leu Ser Ile Asp Phe Asp Lys Leu Asp Tyr Phe 600 605 610 gcc ggt gtc tct gtt tat gtg cac agg gag tgt ttc aac gag gag ttc 2289 Ala Gly Val Ser Val Tyr Val His Arg Glu Cys Phe Asn Glu Glu Phe 615 620 625 ttc aac caa atg cta acc gaa tgc gaa gct gcc caa ggc tta ttg gtg 2337 Phe Asn Gln Met Leu Thr Glu Cys Glu Ala Ala Gln Gly Leu Leu Val 630 635 640 cca tcg agt ttc tcc gat gaa gtg gat ttc gcc att gtc agc ttt gag 2385 Pro Ser Ser Phe Ser Asp Glu Val Asp Phe Ala Ile Val Ser Phe Glu 645 650 655 gta gcc ttc gat gtg aag caa tta ccc gtc aag gcc cgc cat gtg gtc 2433 Val Ala Phe Asp Val Lys Gln Leu Pro Val Lys Ala Arg His Val Val 660 665 670 675 acc gaa ctg ttc ctg gaa agc tgt atg aaa aag aat caa ctg ctg ccc 2481 Thr Glu Leu Phe Leu Glu Ser Cys Met Lys Lys Asn Gln Leu Leu Pro 680 685 690 atc gaa tat tac cac aaa cat gtg ccg gct acc gca ctg cgt cag ccg 2529 Ile Glu Tyr Tyr His Lys His Val Pro Ala Thr Ala Leu Arg Gln Pro 695 700 705 ctt aag gga atg act att gtc gta tcc att tat gca gga ttg gag cgg 2577 Leu Lys Gly Met Thr Ile Val Val Ser Ile Tyr Ala Gly Leu Glu Arg 710 715 720 gac ttt att aat gcg aca gca gaa cta ctt ggc gcc tcc gtc aat aag 2625 Asp Phe Ile Asn Ala Thr Ala Glu Leu Leu Gly Ala Ser Val Asn Lys 725 730 735 aca ttc atc aag aag gag aaa ccg ctg ctg gtg tgt ccc agt gcc gag 2673 Thr Phe Ile Lys Lys Glu Lys Pro Leu Leu Val Cys Pro Ser Ala Glu 740 745 750 755 ggc tcc aag tat gaa ggt gcc atc aaa tgg aac tat ccc gta gtc aca 2721 Gly Ser Lys Tyr Glu Gly Ala Ile Lys Trp Asn Tyr Pro Val Val Thr 760 765 770 tcc gat tgg ctg gtg cag tgc gcc cgc act ggt cag aag ctg ccc ttc 2769 Ser Asp Trp Leu Val Gln Cys Ala Arg Thr Gly Gln Lys Leu Pro Phe 775 780 785 gtt gga tat ttg gtg ggc aag agt ccc gag gat ttc ccc ata tcg cca 2817 Val Gly Tyr Leu Val Gly Lys Ser Pro Glu Asp Phe Pro Ile Ser Pro 790 795 800 cgc ttg cgg gac agc aat agc cgg aca gca aga aga ccg aat gaa tcc 2865 Arg Leu Arg Asp Ser Asn Ser Arg Thr Ala Arg Arg Pro Asn Glu Ser 805 810 815 aca ttg gtg gct caa ccg gat gta acc 
atg gag gag gcc gag aac caa 2913 Thr Leu Val Ala Gln Pro Asp Val Thr Met Glu Glu Ala Glu Asn Gln 820 825 830 835 ccg gcg gga tct gtc aca cca gtt act gct ggc agt cca gga gct cct 2961 Pro Ala Gly Ser Val Thr Pro Val Thr Ala Gly Ser Pro Gly Ala Pro 840 845 850 gaa ctg acg ccc ctg cgc aac aag aga gtt tcc gag cta gct gga ata 3009 Glu Leu Thr Pro Leu Arg Asn Lys Arg Val Ser Glu Leu Ala Gly Ile 855 860 865 cca gga ggc agt gct cgt cat cgt ggc acc agc tcc aca tct tct ccg 3057 Pro Gly Gly Ser Ala Arg His Arg Gly Thr Ser Ser Thr Ser Ser Pro 870 875 880 gac tca cca tgc acg cca ctt agc cag gtg ggt gcc cag caa tac aat 3105 Asp Ser Pro Cys Thr Pro Leu Ser Gln Val Gly Ala Gln Gln Tyr Asn 885 890 895 ctg gac ttc cta gag caa ttc gtt caa cgc ctg gat aca gaa gag ggc 3153 Leu Asp Phe Leu Glu Gln Phe Val Gln Arg Leu Asp Thr Glu Glu Gly 900 905 910 915 aag gat tgt gtg cgc gag att atc cgt gaa atg cgc gag aat caa acg 3201 Lys Asp Cys Val Arg Glu Ile Ile Arg Glu Met Arg Glu Asn Gln Thr 920 925 930 ccg gaa ttg gaa cgc att cga cgg cag gcc tgc acg ccc gtc agt cgt 3249 Pro Glu Leu Glu Arg Ile Arg Arg Gln Ala Cys Thr Pro Val Ser Arg 935 940 945 aag cat caa cga cca gca ccg gga att cca gat ttt tgt ctc act ccc 3297 Lys His Gln Arg Pro Ala Pro Gly Ile Pro Asp Phe Cys Leu Thr Pro 950 955 960 gag ttc cag cag cgg atg gcc gat gat ttt gag cgg cgc tgg cgc cta 3345 Glu Phe Gln Gln Arg Met Ala Asp Asp Phe Glu Arg Arg Trp Arg Leu 965 970 975 ccc acc atg aaa atc aaa cca gac aca ccg ttg gcc gtc atc agg cag 3393 Pro Thr Met Lys Ile Lys Pro Asp Thr Pro Leu Ala Val Ile Arg Gln 980 985 990 995 cgc gtg atg cgg atc aca tgc gaa act ctg ggc atc gaa tat gaa gaa 3441 Arg Val Met Arg Ile Thr Cys Glu Thr Leu Gly Ile Glu Tyr Glu Glu 1000 1005 1010 agt aat gct aag acg cca acg cta tcg gaa tcg cca tca acg gta aag 3489 Ser Asn Ala Lys Thr Pro Thr Leu Ser Glu Ser Pro Ser Thr Val Lys 1015 1020 1025 aag aag ccg cca acc agg acg acg cag gcc acc aaa ctc aac ttt gat 3537 Lys Lys Pro Pro Thr Arg Thr Thr Gln Ala Thr Lys Leu Asn Phe Asp 
1030 1035 1040 aga tca ccg aaa aca ccg aag cta tcg cta ggt aaa aag acg cca ctc 3585 Arg Ser Pro Lys Thr Pro Lys Leu Ser Leu Gly Lys Lys Thr Pro Leu 1045 1050 1055 cgg gtg tcc atg ggt tca ccc cgc agc gga acg caa tca ccc ttc gta 3633 Arg Val Ser Met Gly Ser Pro Arg Ser Gly Thr Gln Ser Pro Phe Val 1060 1065 1070 1075 ccg aac aca cag agt cca atc gaa gca gcg ccg cca cgg cgt tca gat 3681 Pro Asn Thr Gln Ser Pro Ile Glu Ala Ala Pro Pro Arg Arg Ser Asp 1080 1085 1090 ggt cca acg tta agc gag gaa ggt cag agc act att aac ttt gac aag 3729 Gly Pro Thr Leu Ser Glu Glu Gly Gln Ser Thr Ile Asn Phe Asp Lys 1095 1100 1105 att agc ttc gag gag tcg gcc gtg ccc gta ttg gcc cca gat gtt ccc 3777 Ile Ser Phe Glu Glu Ser Ala Val Pro Val Leu Ala Pro Asp Val Pro 1110 1115 1120 aca gtg gcg cca gat gtc aag caa atc acc gac tat cta aag aac tgc 3825 Thr Val Ala Pro Asp Val Lys Gln Ile Thr Asp Tyr Leu Lys Asn Cys 1125 1130 1135 gaa tcg cga agg aac agt ctg aag cgt agc cac gac aac gac atg gat 3873 Glu Ser Arg Arg Asn Ser Leu Lys Arg Ser His Asp Asn Asp Met Asp 1140 1145 1150 1155 tgc ggc gag agc gag gtg cag tat gtg cag ccc ttc gag tcc gag ggc 3921 Cys Gly Glu Ser Glu Val Gln Tyr Val Gln Pro Phe Glu Ser Glu Gly 1160 1165 1170 ttc gcc ctg ggc acc gag gac atg gtc gac tgg cgc gat ccg gcg gag 3969 Phe Ala Leu Gly Thr Glu Asp Met Val Asp Trp Arg Asp Pro Ala Glu 1175 1180 1185 ttc aat gcg gca aag aga aga tcc tct ggc ggt tcg ccc aag atg cag 4017 Phe Asn Ala Ala Lys Arg Arg Ser Ser Gly Gly Ser Pro Lys Met Gln 1190 1195 1200 tat gct ggc ata ccg tgc ttc agc atc tcg tgc ggt gat gat gac gaa 4065 Tyr Ala Gly Ile Pro Cys Phe Ser Ile Ser Cys Gly Asp Asp Asp Glu 1205 1210 1215 aag cgg gcg gaa cta atc gcg cgg atc acc caa ctg ggc gga aaa gtg 4113 Lys Arg Ala Glu Leu Ile Ala Arg Ile Thr Gln Leu Gly Gly Lys Val 1220 1225 1230 1235 tgc gag aat ctg gtg aac tat gac gat agc tgc acc cat ctg ctg tgc 4161 Cys Glu Asn Leu Val Asn Tyr Asp Asp Ser Cys Thr His Leu Leu Cys 1240 1245 1250 gag cgt ccg aat cgc ggc gag aag atg ctg gcc 
tgc att gcg gcc ggg 4209 Glu Arg Pro Asn Arg Gly Glu Lys Met Leu Ala Cys Ile Ala Ala Gly 1255 1260 1265 aaa tgg ata cta aat atc cag tat atc gaa cag tcg cat gcg cgt ggc 4257 Lys Trp Ile Leu Asn Ile Gln Tyr Ile Glu Gln Ser His Ala Arg Gly 1270 1275 1280 gac ttt ctg gac gaa acg ctg tac gaa tgg ggc aac ccg aag gcc att 4305 Asp Phe Leu Asp Glu Thr Leu Tyr Glu Trp Gly Asn Pro Lys Ala Ile 1285 1290 1295 aac ttg ccc acc ctg gca ccg gag gag gag ccc atc gct gcc gcc gtc 4353 Asn Leu Pro Thr Leu Ala Pro Glu Glu Glu Pro Ile Ala Ala Ala Val 1300 1305 1310 1315 cac cgt tgg cgc acg gag ttg agt gcg tgc ggc ggt ggc gcc ttc tcc 4401 His Arg Trp Arg Thr Glu Leu Ser Ala Cys Gly Gly Gly Ala Phe Ser 1320 1325 1330 gat cac cgg gtt ata ctc agc atg aac gag agg agc ggg gcg ccc atc 4449 Asp His Arg Val Ile Leu Ser Met Asn Glu Arg Ser Gly Ala Pro Ile 1335 1340 1345 agg aat gtg ctg cgt gcc ggc ggc gct tgc atc ctg gag cca aca act 4497 Arg Asn Val Leu Arg Ala Gly Gly Ala Cys Ile Leu Glu Pro Thr Thr 1350 1355 1360 ccg ttc tcc aaa gat cct gtg gcc aaa agt gcc agc cat tgt ttt gtc 4545 Pro Phe Ser Lys Asp Pro Val Ala Lys Ser Ala Ser His Cys Phe Val 1365 1370 1375 gat gtg aag aag gcg ccg ctg tcg acc cag gac atg gag tat ctt cac 4593 Asp Val Lys Lys Ala Pro Leu Ser Thr Gln Asp Met Glu Tyr Leu His 1380 1385 1390 1395 aaa tgt ggt gtg cag gtg ctc agc cag att gcc atc aac gcc tat ttg 4641 Lys Cys Gly Val Gln Val Leu Ser Gln Ile Ala Ile Asn Ala Tyr Leu 1400 1405 1410 atg aac ggc agg gat gcc gat ctg gga aag tac cag ctc ctt tga 4686 Met Asn Gly Arg Asp Ala Asp Leu Gly Lys Tyr Gln Leu Leu 1415 1420 1425 acgaatcatt agattctttg tacatgcatt ggattttttt tttgttcctg ttctatattt 4746 tgatattttg tggttatata aaagttattg aattaaagcc ttcaaaggtt taatttaaaa 4806 taggtctagt ttaatcagtt gtaagtgtaa ttttgaattt ttgatttgtg tccttttctt 4866 ttggttactg aaaaaaatca taatatttgg aagcctttag cgtaatgaat tactaataaa 4926 ttaatatcgt taattaactt caattaactt caaaataaac gaactgaaat tgaagtccaa 4986 aaccatatga taattgtatt tagtgattta tttctaaagg tgttgttgaa 
aattaattct 5046 aagtcaaagc gcaaattatt tttgaaagaa atttgacgat tttctatttt tttattgaag 5106 ctaccttagt ttgtttcttt tattaatttc attccaactt caatttaata aacaaaatgt 5166 aatattaaaa acctaaagaa attctatttt tttcatcccg atttcgatca attacattat 5226 ttattttccg cagccaaatt aaacaataaa tcaatatgca agtgtattgt 5276 2 1425 PRT Drosophila sp. 2 Met Ser Ile Ser Met Ser Met Asp Glu Thr Ile Cys Ala Tyr Phe Val 1 5 10 15 Asn Asn Leu Lys Pro Gly Asp Gly Gly Val Gln Glu Ala Asp Thr Leu 20 25 30 Gln Gln Phe Glu Ala Ala Arg Glu Leu Leu Gly Gln Gln Leu Ala Glu 35 40 45 Thr Gln Ile Arg Gln Ile Lys Pro Ser Glu Gly Tyr Pro Leu Ile Ala 50 55 60 Ala Gly Asn Leu Thr Lys Lys Asp Val Phe Val Leu Thr Gln Phe Glu 65 70 75 80 Gly Glu Phe Phe Glu Gln Leu Gln Gln Thr Arg Ala Leu Ile Leu Gly 85 90 95 Pro Pro Cys Leu Ile Thr Cys Leu Arg Arg Asn Glu Pro Ile Pro Glu 100 105 110 Gly Ser Ser Ala Ile Tyr Ser Thr Ala Met Arg Asp Leu Gln Val Ser 115 120 125 Ala Thr Gly Ile Thr Pro Gln Lys Lys Glu Glu Leu Ser Arg Leu Ile 130 135 140 Asn Trp Met Gly Gly Ile Tyr Phe Gln Ser Phe Gly His Arg Thr Thr 145 150 155 160 His Leu Ile Ser Asn Thr Ile Lys Ser Ser Lys Tyr Glu Gln Ala Thr 165 170 175 Leu Asn Gly Val Pro Val Met His Val Asp Trp Val Gln Tyr Val Trp 180 185 190 Asp Gln Ser Arg Arg Ser Gln Arg Glu Gly Ile Met Ala Thr Asp Pro 195 200 205 Asp Phe Asp Lys Tyr Arg Leu Pro Ile Phe Phe Gly Ala Asn Ile Thr 210 215 220 Cys Ser Gly Leu Asp Val Ala Arg Lys Asp Gln Val Met Arg Leu Val 225 230 235 240 Asn Asp Asn Gly Gly Ile Tyr His Arg Ala Phe Arg Ser Gln Val Val 245 250 255 Asp Ile Val Ile Thr Glu Gln Thr Lys Thr Asp Thr Glu Lys Tyr Lys 260 265 270 Ala Ala Ile Arg Tyr Lys Lys Asp Val Leu Leu Pro Glu Trp Ile Phe 275 280 285 Asp Ser Cys Asn Arg Gly Tyr Ala Leu Pro Thr Lys Asp Tyr Glu Val 290 295 300 Arg Pro Gly Lys Thr Ser Ser Thr Pro Thr Lys Thr Thr Arg Pro Gly 305 310 315 320 Ala Ala Pro Gly Ala Asp Gln Thr His Leu Ser Asp Leu Ser Arg Ile 325 330 335 Ser Phe Val Ser Gly Ser Arg Arg Met Cys Ser Asp Leu Ser Thr Val 340 345 350 Asn Glu Ser 
Val Ser Ser Val Gly Ser Ser Ser Pro Ala Lys Gln Leu 355 360 365 Leu Lys Gln Ala Thr Ser Ser Gly Arg Asn Tyr Gln Gln Val Leu Ala 370 375 380 Glu Ile Glu Pro Arg Gln Ala Lys Lys Ala Gly Ala Phe Leu Asp Gly 385 390 395 400 Cys Cys Val Tyr Leu Ser Gly Phe Arg Ser Glu Glu Arg Glu Lys Leu 405 410 415 Asn Arg Val Leu Asn Thr Gly Gly Ala Thr Arg Tyr Asp Glu Ala Asn 420 425 430 Glu Gly Ile Ser His Ile Ile Val Gly Gln Leu Asp Asp Ala Glu Tyr 435 440 445 Arg Gln Trp Gln Arg Asp Gly Leu Met Gly Ser Val His Val Val Arg 450 455 460 Leu Asp Trp Leu Leu Glu Ser Ile Arg Ala Gly Arg Val Val Ser Glu 465 470 475 480 Leu Val His Arg Val Ser Met Pro Gln Asn Arg Glu Pro Asp Val Ala 485 490 495 Ser Pro Ala Ser Lys Arg Thr Leu Arg Ser Met Asn His Ser Phe Lys 500 505 510 Gln Pro Thr Leu Pro Ile Lys Lys Lys Leu Phe Asp Gln Glu Pro Asp 515 520 525 Pro Val Gln Glu Gln Glu His Glu Glu Pro Asp His Thr Leu Leu Asp 530 535 540 Gln Tyr Ser Gln Asp Gln Gly Ala Val Ala Gln Leu Pro Pro Ala Asp 545 550 555 560 Val Ser Leu Leu Gln Pro Ala Ala Ser Ser Thr Gln Met Asp Ile Arg 565 570 575 Gln Arg Val Ser Val Ala Asn Pro Lys Pro Pro Ala Glu Gly Leu Gln 580 585 590 Leu Pro Asp Leu Ser Ala Ser Thr Leu Ser Ile Asp Phe Asp Lys Leu 595 600 605 Asp Tyr Phe Ala Gly Val Ser Val Tyr Val His Arg Glu Cys Phe Asn 610 615 620 Glu Glu Phe Phe Asn Gln Met Leu Thr Glu Cys Glu Ala Ala Gln Gly 625 630 635 640 Leu Leu Val Pro Ser Ser Phe Ser Asp Glu Val Asp Phe Ala Ile Val 645 650 655 Ser Phe Glu Val Ala Phe Asp Val Lys Gln Leu Pro Val Lys Ala Arg 660 665 670 His Val Val Thr Glu Leu Phe Leu Glu Ser Cys Met Lys Lys Asn Gln 675 680 685 Leu Leu Pro Ile Glu Tyr Tyr His Lys His Val Pro Ala Thr Ala Leu 690 695 700 Arg Gln Pro Leu Lys Gly Met Thr Ile Val Val Ser Ile Tyr Ala Gly 705 710 715 720 Leu Glu Arg Asp Phe Ile Asn Ala Thr Ala Glu Leu Leu Gly Ala Ser 725 730 735 Val Asn Lys Thr Phe Ile Lys Lys Glu Lys Pro Leu Leu Val Cys Pro 740 745 750 Ser Ala Glu Gly Ser Lys Tyr Glu Gly Ala Ile Lys Trp Asn Tyr Pro 755 760 765 Val Val Thr Ser 
Asp Trp Leu Val Gln Cys Ala Arg Thr Gly Gln Lys 770 775 780 Leu Pro Phe Val Gly Tyr Leu Val Gly Lys Ser Pro Glu Asp Phe Pro 785 790 795 800 Ile Ser Pro Arg Leu Arg Asp Ser Asn Ser Arg Thr Ala Arg Arg Pro 805 810 815 Asn Glu Ser Thr Leu Val Ala Gln Pro Asp Val Thr Met Glu Glu Ala 820 825 830 Glu Asn Gln Pro Ala Gly Ser Val Thr Pro Val Thr Ala Gly Ser Pro 835 840 845 Gly Ala Pro Glu Leu Thr Pro Leu Arg Asn Lys Arg Val Ser Glu Leu 850 855 860 Ala Gly Ile Pro Gly Gly Ser Ala Arg His Arg Gly Thr Ser Ser Thr 865 870 875 880 Ser Ser Pro Asp Ser Pro Cys Thr Pro Leu Ser Gln Val Gly Ala Gln 885 890 895 Gln Tyr Asn Leu Asp Phe Leu Glu Gln Phe Val Gln Arg Leu Asp Thr 900 905 910 Glu Glu Gly Lys Asp Cys Val Arg Glu Ile Ile Arg Glu Met Arg Glu 915 920 925 Asn Gln Thr Pro Glu Leu Glu Arg Ile Arg Arg Gln Ala Cys Thr Pro 930 935 940 Val Ser Arg Lys His Gln Arg Pro Ala Pro Gly Ile Pro Asp Phe Cys 945 950 955 960 Leu Thr Pro Glu Phe Gln Gln Arg Met Ala Asp Asp Phe Glu Arg Arg 965 970 975 Trp Arg Leu Pro Thr Met Lys Ile Lys Pro Asp Thr Pro Leu Ala Val 980 985 990 Ile Arg Gln Arg Val Met Arg Ile Thr Cys Glu Thr Leu Gly Ile Glu 995 1000 1005 Tyr Glu Glu Ser Asn Ala Lys Thr Pro Thr Leu Ser Glu Ser Pro Ser 1010 1015 1020 Thr Val Lys Lys Lys Pro Pro Thr Arg Thr Thr Gln Ala Thr Lys Leu 1025 1030 1035 1040 Asn Phe Asp Arg Ser Pro Lys Thr Pro Lys Leu Ser Leu Gly Lys Lys 1045 1050 1055 Thr Pro Leu Arg Val Ser Met Gly Ser Pro Arg Ser Gly Thr Gln Ser 1060 1065 1070 Pro Phe Val Pro Asn Thr Gln Ser Pro Ile Glu Ala Ala Pro Pro Arg 1075 1080 1085 Arg Ser Asp Gly Pro Thr Leu Ser Glu Glu Gly Gln Ser Thr Ile Asn 1090 1095 1100 Phe Asp Lys Ile Ser Phe Glu Glu Ser Ala Val Pro Val Leu Ala Pro 1105 1110 1115 1120 Asp Val Pro Thr Val Ala Pro Asp Val Lys Gln Ile Thr Asp Tyr Leu 1125 1130 1135 Lys Asn Cys Glu Ser Arg Arg Asn Ser Leu Lys Arg Ser His Asp Asn 1140 1145 1150 Asp Met Asp Cys Gly Glu Ser Glu Val Gln Tyr Val Gln Pro Phe Glu 1155 1160 1165 Ser Glu Gly Phe Ala Leu Gly Thr Glu Asp Met Val Asp Trp Arg 
Asp 1170 1175 1180 Pro Ala Glu Phe Asn Ala Ala Lys Arg Arg Ser Ser Gly Gly Ser Pro 1185 1190 1195 1200 Lys Met Gln Tyr Ala Gly Ile Pro Cys Phe Ser Ile Ser Cys Gly Asp 1205 1210 1215 Asp Asp Glu Lys Arg Ala Glu Leu Ile Ala Arg Ile Thr Gln Leu Gly 1220 1225 1230 Gly Lys Val Cys Glu Asn Leu Val Asn Tyr Asp Asp Ser Cys Thr His 1235 1240 1245 Leu Leu Cys Glu Arg Pro Asn Arg Gly Glu Lys Met Leu Ala Cys Ile 1250 1255 1260 Ala Ala Gly Lys Trp Ile Leu Asn Ile Gln Tyr Ile Glu Gln Ser His 1265 1270 1275 1280 Ala Arg Gly Asp Phe Leu Asp Glu Thr Leu Tyr Glu Trp Gly Asn Pro 1285 1290 1295 Lys Ala Ile Asn Leu Pro Thr Leu Ala Pro Glu Glu Glu Pro Ile Ala 1300 1305 1310 Ala Ala Val His Arg Trp Arg Thr Glu Leu Ser Ala Cys Gly Gly Gly 1315 1320 1325 Ala Phe Ser Asp His Arg Val Ile Leu Ser Met Asn Glu Arg Ser Gly 1330 1335 1340 Ala Pro Ile Arg Asn Val Leu Arg Ala Gly Gly Ala Cys Ile Leu Glu 1345 1350 1355 1360 Pro Thr Thr Pro Phe Ser Lys Asp Pro Val Ala Lys Ser Ala Ser His 1365 1370 1375 Cys Phe Val Asp Val Lys Lys Ala Pro Leu Ser Thr Gln Asp Met Glu 1380 1385 1390 Tyr Leu His Lys Cys Gly Val Gln Val Leu Ser Gln Ile Ala Ile Asn 1395 1400 1405 Ala Tyr Leu Met Asn Gly Arg Asp Ala Asp Leu Gly Lys Tyr Gln Leu 1410 1415 1420 Leu 1425 3 14 PRT Drosophila sp. 3 Ala Met Arg Asp Leu Gln Val Ser Ala Thr Gly Ile Thr Pro 1 5 10 4 16 PRT Drosophila sp. 4 Lys Glu Glu Leu Ser Arg Leu Ile Asn Trp Met Gly Gly Ile Tyr Phe 1 5 10 15 5 11 PRT Drosophila sp. 5 His Arg Thr Thr His Leu Ile Ser Asn Thr Ile 1 5 10 6 27 PRT Drosophila sp. 
6 Gly Val Pro Val Met His Val Asp Trp Val Gln Tyr Val Trp Asp Gln 1 5 10 15 Ser Arg Arg Ser Gln Arg Glu Gly Ile Met Ala 20 25 7 14 PRT Homo sapiens 7 Val Met Ser Asp Val Thr Ile Ser Cys Thr Ser Leu Glu Lys 1 5 10 8 16 PRT Homo sapiens 8 Arg Glu Glu Val His Lys Tyr Val Gln Met Met Gly Gly Arg Val Tyr 1 5 10 15 9 11 PRT Homo sapiens 9 Val Ser Val Thr His Leu Ile Ala Gly Glu Val 1 5 10 10 25 PRT Homo sapiens 10 Lys Lys Pro Ile Leu Leu Pro Ser Trp Ile Lys Thr Leu Trp Glu Lys 1 5 10 15 Ser Gln Glu Lys Lys Ile Thr Arg Tyr 20 25 11 14 PRT Caenorhabditis elegans 11 Val Phe Gln Asp Val Lys Ile Ser Phe Thr Gly Leu Asn Leu 1 5 10 12 16 PRT Caenorhabditis elegans 12 Lys Gln Glu Leu Tyr Glu Lys Ile Gly Trp Met Cys Gly Val Val Gly 1 5 10 15 13 11 PRT Caenorhabditis elegans 13 His Glu Thr Thr His Leu Val Thr Glu Lys Ala 1 5 10 14 27 PRT Caenorhabditis elegans 14 Ser Ile Lys Leu Met Arg Ile Gly Trp Ile Asp Asp Leu Trp Glu Thr 1 5 10 15 Ser Gln Thr Thr Met Gly Arg Phe Ser Ala Leu 20 25 15 14 PRT Schizosaccharomyces pombe 15 Leu Lys Gly Phe Val Ile Cys Cys Thr Ser Ile Asp Leu Lys 1 5 10 16 16 PRT Schizosaccharomyces pombe 16 Arg Thr Glu Ile Ser Thr Lys Ala Thr Lys Leu Gly Ala Ala Tyr Arg 1 5 10 15 17 11 PRT Schizosaccharomyces pombe 17 Lys Asp Val Thr His Leu Ile Ala Gly Asp Phe 1 5 10 18 27 PRT Schizosaccharomyces pombe 18 Trp Ile Pro Val Leu Tyr Glu Ser Trp Val Gln Gly Glu Asp Leu Asp 1 5 10 15 Asp Gly Leu Leu Val Asp Lys His Phe Leu Pro 20 25 19 13 PRT Mus sp. 19 Met Leu Asn Leu Val Leu Cys Phe Thr Gly Phe Arg Lys 1 5 10 20 16 PRT Mus sp. 20 Leu Val Lys Leu Val Thr Leu Val His His Met Gly Gly Val Ile Arg 1 5 10 15 21 11 PRT Mus sp. 21 Ser Lys Val Thr His Leu Val Ala Asn Cys Thr 1 5 10 22 27 PRT Mus sp. 22 Gly Thr Pro Ile Met Lys Pro Glu Trp Ile Tyr Lys Ala Trp Glu Arg 1 5 10 15 Arg Asn Glu Gln Cys Phe Cys Ala Ala Val Asp 20 25 23 723 PRT Drosophila sp. 
SITE (212)..(227) Xaa is uncertain 23 Ile Leu Gly Pro Pro Cys Leu Ile Thr Cys Leu Arg Arg Asn Glu Pro 1 5 10 15 Ile Pro Glu Gly Ser Ser Ala Ile Tyr Ser Thr Ala Met Arg Asp Leu 20 25 30 Gln Val Ser Ala Thr Gly Ile Thr Pro Gln Lys Lys Glu Glu Leu Ser 35 40 45 Arg Leu Ile Asn Trp Met Gly Gly Ile Tyr Phe Gln Ser Phe Gly His 50 55 60 Arg Thr Thr His Leu Ile Ser Asn Thr Ile Lys Ser Ser Lys Tyr Glu 65 70 75 80 Gln Ala Thr Leu Asn Gly Val Pro Val Met His Val Asp Trp Val Gln 85 90 95 Tyr Val Trp Asp Gln Ser Arg Arg Ser Gln Arg Glu Gly Ile Met Ala 100 105 110 Thr Asp Pro Asp Phe Asp Lys Tyr Arg Leu Pro Ile Phe Phe Gly Ala 115 120 125 Asn Ile Thr Cys Ser Gly Leu Asp Val Ala Arg Lys Asp Gln Val Met 130 135 140 Arg Leu Val Asn Asp Asn Gly Gly Ile Tyr His Arg Ala Phe Arg Ser 145 150 155 160 Gln Val Val Asp Ile Val Ile Thr Glu Gln Thr Lys Thr Asp Thr Glu 165 170 175 Lys Tyr Lys Ala Ala Ile Arg Tyr Lys Lys Asp Val Leu Leu Pro Glu 180 185 190 Trp Ile Phe Asp Ser Cys Asn Arg Gly Tyr Ala Leu Pro Thr Lys Asp 195 200 205 Tyr Glu Val Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215 220 Xaa Xaa Xaa Ala Ala Pro Gly Ala Asp Gln Thr His Leu Ser Asp Leu 225 230 235 240 Ser Arg Ile Ser Phe Val Ser Gly Ser Arg Arg Met Cys Ser Asp Leu 245 250 255 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Pro Ala 260 265 270 Lys Gln Leu Leu Lys Gln Ala Thr Ser Ser Gly Arg Asn Tyr Gln Gln 275 280 285 Val Leu Ala Glu Ile Glu Pro Arg Gln Ala Lys Lys Ala Gly Ala Phe 290 295 300 Leu Asp Gly Cys Cys Val Tyr Leu Ser Gly Phe Arg Ser Glu Glu Arg 305 310 315 320 Glu Lys Leu Asn Arg Val Leu Asn Thr Gly Gly Ala Thr Arg Tyr Asp 325 330 335 Glu Ala Asn Glu Gly Ile Ser His Ile Ile Val Gly Gln Leu Asp Asp 340 345 350 Ala Glu Tyr Arg Gln Trp Gln Arg Asp Gly Leu Met Gly Ser Val His 355 360 365 Val Val Arg Leu Asp Trp Leu Leu Glu Ser Ile Arg Ala Gly Arg Val 370 375 380 Val Ser Glu Leu Val His Arg Val Ser Met Pro Gln Asn Arg Glu Pro 385 390 395 400 Asp Val Ala Ser Pro Ala Ser Lys Arg Thr Leu Arg Ser Met Asn 
His 405 410 415 Ser Phe Lys Gln Pro Thr Leu Pro Ile Lys Lys Lys Leu Phe Asp Gln 420 425 430 Glu Pro Asp Pro Val Gln Glu Gln Glu His Glu Glu Pro Asp His Thr 435 440 445 Leu Leu Asp Gln Tyr Ser Gln Asp Gln Gly Ala Val Ala Gln Leu Pro 450 455 460 Pro Ala Asp Val Ser Leu Leu Gln Pro Ala Ala Ser Ser Thr Gln Met 465 470 475 480 Asp Ile Arg Gln Arg Val Ser Val Ala Asn Pro Lys Pro Pro Ala Glu 485 490 495 Gly Leu Gln Leu Pro Asp Leu Ser Ala Ser Thr Leu Ser Ile Asp Phe 500 505 510 Asp Lys Leu Asp Tyr Phe Ala Gly Val Ser Val Tyr Val His Arg Glu 515 520 525 Cys Phe Asn Glu Glu Phe Phe Asn Gln Met Leu Thr Glu Cys Glu Ala 530 535 540 Ala Gln Gly Leu Leu Val Pro Ser Ser Phe Ser Asp Glu Val Asp Phe 545 550 555 560 Ala Ile Val Ser Phe Glu Val Ala Phe Asp Val Lys Gln Leu Pro Val 565 570 575 Lys Ala Arg His Val Val Thr Glu Leu Phe Leu Glu Ser Cys Met Lys 580 585 590 Lys Asn Gln Leu Leu Pro Ile Glu Tyr Tyr His Lys His Val Pro Ala 595 600 605 Thr Ala Leu Arg Gln Pro Leu Lys Gly Met Thr Ile Val Val Ser Ile 610 615 620 Tyr Ala Gly Leu Glu Arg Asp Phe Ile Asn Ala Thr Ala Glu Leu Leu 625 630 635 640 Gly Ala Ser Val Asn Lys Thr Phe Ile Lys Lys Glu Lys Pro Leu Leu 645 650 655 Val Cys Pro Ser Ala Glu Gly Ser Lys Tyr Glu Gly Ala Ile Lys Trp 660 665 670 Asn Tyr Pro Val Val Thr Ser Asp Trp Leu Val Gln Cys Ala Arg Thr 675 680 685 Gly Gln Lys Leu Pro Phe Val Gly Tyr Leu Val Gly Lys Ser Pro Glu 690 695 700 Asp Phe Pro Ile Ser Pro Arg Leu Arg Asp Ser Asn Ser Arg Thr Ala 705 710 715 720 Arg Arg Pro 24 689 PRT Homo sapiens 24 Ile Val Gly Pro Gln Val Val Ile Phe Cys Met His His Gln Arg Cys 1 5 10 15 Val Pro Arg Ala Glu His Pro Val Tyr Asn Met Val Met Ser Asp Val 20 25 30 Thr Ile Ser Cys Thr Ser Leu Glu Lys Glu Lys Arg Glu Glu Val His 35 40 45 Lys Tyr Val Gln Met Met Gly Gly Arg Val Tyr Arg Asp Leu Asn Val 50 55 60 Ser Val Thr His Leu Ile Ala Gly Glu Val Gly Ser Lys Lys Tyr Leu 65 70 75 80 Val Ala Ala Asn Leu Lys Lys Pro Ile Leu Leu Pro Ser Trp Ile Lys 85 90 95 Thr Leu Trp Glu Lys Ser Gln Glu Lys Lys 
Ile Thr Arg Tyr Thr Asp 100 105 110 Ile Asn Met Glu Asp Phe Lys Cys Pro Ile Phe Leu Gly Cys Ile Ile 115 120 125 Cys Val Thr Gly Leu Cys Gly Leu Asp Arg Lys Glu Val Gln Gln Leu 130 135 140 Thr Val Lys His Gly Gly Gln Tyr Met Gly Gln Leu Lys Met Asn Glu 145 150 155 160 Cys Thr His Leu Ile Val Gln Glu Pro Lys Gly Gln Lys Tyr Glu Cys 165 170 175 Ala Lys Arg Trp Asn Val His Cys Val Thr Thr Gln Trp Phe Phe Asp 180 185 190 Ser Ile Glu Lys Gly Phe Cys Gln Asp Glu Ser Ile Tyr Lys Thr Glu 195 200 205 Pro Arg Pro Glu Ala Lys Thr Met Pro Asn Ser Ser Thr Pro Thr Ser 210 215 220 Gln Ile Asn Thr Ile Asp Ser Arg Thr Leu Ser Asp Val Ser Asn Ile 225 230 235 240 Ser Asn Ile Asn Ala Ser Cys Val Ser Glu Ser Ile Cys Asn Ser Leu 245 250 255 Asn Ser Lys Leu Glu Pro Thr Leu Glu Asn Leu Glu Asn Leu Asp Val 260 265 270 Ser Ala Phe Gln Ala Pro Glu Asp Leu Leu Asp Gly Cys Arg Ile Tyr 275 280 285 Leu Cys Gly Phe Ser Gly Arg Lys Leu Asp Lys Leu Arg Arg Leu Ile 290 295 300 Asn Ser Gly Gly Gly Val Arg Phe Asn Gln Leu Asn Glu Asp Val Thr 305 310 315 320 His Val Ile Val Gly Asp Tyr Asp Asp Glu Leu Lys Gln Phe Trp Asn 325 330 335 Lys Ser Ala His Arg Pro His Val Val Gly Ala Lys Trp Leu Leu Glu 340 345 350 Cys Phe Ser Lys Gly Tyr Met Leu Ser Glu Glu Pro Tyr Ile His Ala 355 360 365 Asn Tyr Gln Pro Val Glu Ile Pro Val Ser His Gln Pro Glu Ser Lys 370 375 380 Ala Ala Leu Leu Lys Lys Lys Asn Ser Ser Phe Ser Lys Lys Asp Phe 385 390 395 400 Ala Pro Ser Glu Lys His Glu Gln Ala Asp Glu Asp Leu Leu Ser Gln 405 410 415 Tyr Glu Asn Gly Ser Ser Thr Val Val Glu Ala Lys Thr Ser Glu Ala 420 425 430 Arg Pro Phe Asn Asp Ser Thr His Ala Glu Pro Leu Asn Asp Ser Thr 435 440 445 His Ile Ser Leu Gln Glu Glu Asn Gln Ser Ser Val Ser His Cys Val 450 455 460 Pro Asp Val Ser Thr Ile Thr Glu Glu Gly Leu Phe Ser Gln Lys Ser 465 470 475 480 Phe Leu Val Leu Gly Phe Ser Asn Glu Asn Glu Ser Asn Ile Ala Asn 485 490 495 Ile Ile Lys Glu Asn Ala Gly Lys Ile Met Ser Leu Leu Ser Arg Thr 500 505 510 Val Ala Asp Tyr Ala Val Val Pro Leu Leu Gly 
Cys Glu Val Glu Ala 515 520 525 Thr Val Gly Glu Val Val Thr Asn Thr Trp Leu Val Thr Cys Ile Asp 530 535 540 Tyr Gln Thr Leu Phe Asp Pro Lys Ser Asn Pro Leu Phe Thr Pro Val 545 550 555 560 Pro Val Met Thr Gly Met Thr Pro Leu Glu Asp Cys Val Ile Ser Phe 565 570 575 Ser Gln Cys Ala Gly Ala Glu Lys Glu Ser Leu Thr Phe Leu Ala Asn 580 585 590 Leu Leu Gly Ala Ser Val Gln Glu Tyr Phe Val Arg Lys Ser Asn Ala 595 600 605 Lys Lys Gly Met Phe Ala Ser Thr His Leu Ile Leu Lys Glu Arg Gly 610 615 620 Gly Ser Lys Tyr Glu Ala Ala Lys Lys Trp Asn Leu Pro Ala Val Thr 625 630 635 640 Ile Ala Trp Leu Leu Glu Thr Ala Arg Thr Gly Lys Arg Ala Asp Glu 645 650 655 Ser His Phe Leu Ile Glu Asn Ser Thr Lys Glu Glu Arg Ser Leu Glu 660 665 670 Thr Glu Ile Thr Asn Gly Ile Asn Leu Asn Ser Asp Thr Ala Glu His 675 680 685 Pro 25 194 PRT Drosophila sp. 25 Glu Lys Arg Ala Glu Leu Ile Ala Arg Ile Thr Gln Leu Gly Gly Lys 1 5 10 15 Val Cys Glu Asn Leu Val Asn Tyr Asp Asp Ser Cys Thr His Leu Leu 20 25 30 Cys Glu Arg Pro Asn Arg Gly Glu Lys Met Leu Ala Cys Ile Ala Ala 35 40 45 Gly Lys Trp Ile Leu Asn Ile Gln Tyr Ile Glu Gln Ser His Ala Arg 50 55 60 Gly Asp Phe Leu Asp Glu Thr Leu Tyr Glu Trp Gly Asn Pro Lys Ala 65 70 75 80 Ile Asn Leu Pro Thr Leu Ala Pro Glu Glu Glu Pro Ile Ala Ala Ala 85 90 95 Val His Arg Trp Arg Thr Glu Leu Ser Ala Cys Gly Gly Gly Ala Phe 100 105 110 Ser Asp His Arg Val Ile Leu Ser Met Asn Glu Arg Ser Gly Ala Pro 115 120 125 Ile Arg Asn Val Leu Arg Ala Gly Gly Ala Cys Ile Leu Glu Pro Thr 130 135 140 Thr Pro Phe Ser Lys Asp Pro Val Ala Lys Ser Ala Ser His Cys Phe 145 150 155 160 Val Asp Val Lys Lys Ala Pro Leu Ser Thr Gln Asp Met Glu Tyr Leu 165 170 175 His Lys Cys Gly Val Gln Val Leu Ser Gln Ile Ala Ile Asn Ala Tyr 180 185 190 Leu Met 26 187 PRT Homo sapiens 26 Gln Glu Arg Ile Asp Tyr Cys His Leu Ile Glu Lys Leu Gly Gly Leu 1 5 10 15 Val Ile Glu Lys Gln Cys Phe Asp Pro Thr Cys Thr His Ile Val Val 20 25 30 Gly His Pro Leu Arg Asn Glu Lys Tyr Leu Ala Ser Val Ala Ala Gly 35 40 45 Lys Trp 
Val Leu His Arg Ser Tyr Leu Glu Ala Cys Arg Thr Ala Gly 50 55 60 His Phe Val Gln Glu Glu Asp Tyr Glu Trp Gly Ser Ser Ser Ile Leu 65 70 75 80 Asp Val Leu Thr Gly Ile Asn Val Gln Gln Arg Arg Leu Ala Leu Ala 85 90 95 Ala Met Arg Trp Arg Lys Lys Ile Gln Gln Arg Gln Glu Ser Gly Ile 100 105 110 Val Glu Gly Ala Phe Ser Gly Trp Lys Val Ile Leu His Val Asp Gln 115 120 125 Ser Arg Glu Ala Gly Phe Lys Arg Leu Leu Gln Ser Gly Gly Ala Lys 130 135 140 Val Leu Pro Gly His Ser Val Pro Leu Phe Lys Glu Ala Thr His Leu 145 150 155 160 Phe Ser Asp Leu Asn Lys Leu Lys Pro Asp Asp Ser Gly Val Asn Ile 165 170 175 Ala Glu Ala Ala Ala Gln Asn Val Tyr Cys Leu 180 185
27 18 DNA Artificial Sequence Description of Artificial Sequence Primer 27 ccgaagctat cgctaggt 18
28 18 DNA Artificial Sequence Description of Artificial Sequence Primer 28 agtcccacgc gcatgcga 18
29 18 DNA Artificial Sequence Description of Artificial Sequence Primer 29 gacgatagct gcacccat 18
30 18 DNA Artificial Sequence Description of Artificial Sequence Primer 30 gtggcggcgc tgcttcga 18
31 18 DNA Artificial Sequence Description of Artificial Sequence Primer 31 gaaggcgccg ctgtcgac 18
32 18 DNA Artificial Sequence Description of Artificial Sequence Primer 32 cgcacctctc ggatctct 18
33 18 DNA Artificial Sequence Description of Artificial Sequence Primer 33 ccacagcttc aagcagcc 18
34 18 DNA Artificial Sequence Description of Artificial Sequence Primer 34 aggcaggtga tgaggcac 18
35 18 DNA Artificial Sequence Description of Artificial Sequence Primer 35 gtcttcttgc tgtccggc 18
36 18 DNA Artificial Sequence Description of Artificial Sequence Primer 36 accgttggca acgctggc 18
37 18 DNA Artificial Sequence Description of Artificial Sequence Primer 37 cgccatgtgg tcaccgaa 18
38 18 DNA Artificial Sequence Description of Artificial Sequence Primer 38 tgtgtggtgg tgaccaac 18
39 18 DNA Artificial Sequence Description of Artificial Sequence Primer 39 ctcttcgctt tggtttag 18
40 18 DNA Artificial Sequence Description of Artificial Sequence Primer 40 gccagcgttg ccaacggt 18
41 18 DNA Artificial Sequence Description of Artificial Sequence Primer 41 ttagcttctc acgctcct 18
42 18 DNA Artificial Sequence Description of Artificial Sequence Primer 42 acgatgatag atgcctcc 18
43 18 DNA Artificial Sequence Description of Artificial Sequence Primer 43 tggcaccaat aagccttg 18
44 18 DNA Artificial Sequence Description of Artificial Sequence Primer 44 gagctcctga actgacgc 18
45 18 DNA Artificial Sequence Description of Artificial Sequence Primer 45 tcgcgattcg cagttctt 18
46 18 DNA Artificial Sequence Description of Artificial Sequence Primer 46 gtcgacagcg gcgccttc 18
47 18 DNA Artificial Sequence Description of Artificial Sequence Primer 47 tagcgtaatg aattacta 18
48 13 DNA Artificial Sequence Description of Artificial Sequence Consensus sequence 48 cacaaccaaa atg 13
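
As an illustrative aside that is not part of the listing as filed, the relationship between some of the 18-nucleotide primers can be checked computationally. The minimal Python sketch below copies four primer sequences from the listing (SEQ ID Nos. 31, 36, 40 and 46) and tests whether one is the reverse complement of another. The function and variable names (`reverse_complement`, `is_reverse_complement_pair`, `PRIMERS`) are hypothetical and chosen only for this example.

```python
# Sketch only: helper names are hypothetical; sequences are copied from the listing.

# Translation table mapping each lowercase base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("acgt", "tgca")

# Primer sequences as printed in the listing (spaces are layout only).
PRIMERS = {
    31: "gaaggcgccg ctgtcgac",
    36: "accgttggca acgctggc",
    40: "gccagcgttg ccaacggt",
    46: "gtcgacagcg gcgccttc",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string, ignoring spaces."""
    return seq.replace(" ", "").translate(COMPLEMENT)[::-1]


def is_reverse_complement_pair(a: int, b: int) -> bool:
    """True if primer b is the exact reverse complement of primer a."""
    return reverse_complement(PRIMERS[a]) == PRIMERS[b].replace(" ", "")


if __name__ == "__main__":
    for a, b in [(31, 46), (36, 40), (31, 40)]:
        print(f"SEQ ID No. {a} vs SEQ ID No. {b}: {is_reverse_complement_pair(a, b)}")
```

Under these assumptions, SEQ ID Nos. 31 and 46 form a reverse-complement pair, as do SEQ ID Nos. 36 and 40, which suggests they target the same site on opposite strands. The listing itself does not state this, so the result should be read as an observation about the sequences rather than a statement of how the primers were used.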

1. A polynucleotide encoding mus101 or a homologue thereof.
 2. A polynucleotide selected from: (a) polynucleotides comprising the nucleotide sequence set out in SEQ ID No. 1 or the complement thereof; (b) polynucleotides comprising a nucleotide sequence capable of hybridising to the nucleotide sequence set out in SEQ ID No. 1, or a fragment thereof; (c) polynucleotides comprising a nucleotide sequence capable of hybridising to the complement of the nucleotide sequence set out in SEQ ID No. 1, or a fragment thereof; and (d) polynucleotides comprising a polynucleotide sequence which is degenerate as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).
 3. A polynucleotide probe which comprises a fragment of at least 15 nucleotides of a polynucleotide as defined in claim 1 or 2.
 4. A polypeptide which comprises the sequence set out in SEQ ID No. 2, or a homologue, variant, derivative or fragment thereof.
 5. A polynucleotide encoding a polypeptide according to claim 4.
 6. A vector comprising a polynucleotide as defined in any one of claims 1, 2 or 5.
 7. An expression vector comprising a polynucleotide as defined in any one of claims 1, 2 or 5, operably linked to a regulatory sequence capable of directing expression of said polynucleotide in a host cell.
 8. An antibody capable of binding the polypeptide of SEQ ID No. 2 or a fragment thereof.
 9. A method for detecting the presence or absence of a polynucleotide as defined in any one of claims 1, 2 or 5 in a biological sample which comprises: (a) bringing the biological sample containing DNA or RNA into contact with a probe according to claim 3 under hybridising conditions; and (b) detecting any duplex formed between the probe and nucleic acid in the sample.
 10. A method for detecting a polypeptide as defined in claim 4 present in a biological sample which comprises: (a) providing an antibody according to claim 8; (b) incubating a biological sample with said antibody under conditions which allow for the formation of an antibody-antigen complex; and (c) determining whether antibody-antigen complex comprising said antibody is formed.
 11. A polynucleotide according to any one of claims 1, 2 or 5 for use in therapy.
 12. A polypeptide according to claim 4 for use in therapy.
 13. An antibody according to claim 8 for use in therapy.
 14. Use of a mus101 polypeptide or homologue, derivative, variant or fragment thereof in a method of identifying a substance capable of affecting mus101 function.
 15. Use of a mus101 polypeptide, or a homologue or fragment thereof, in an assay for identifying a substance capable of increasing the susceptibility of a cell to DNA damaging agents.
 16. A method for identifying a substance capable of binding to a mus101 polypeptide or a homologue, derivative, variant or fragment thereof, which method comprises incubating the mus101 polypeptide or homologue, derivative, variant or fragment thereof with a candidate substance under suitable conditions and determining whether the substance binds to the mus101 polypeptide or homologue, derivative, variant or fragment thereof.
 17. A substance identified by the method of claim 14, 15 or 16.
 18. A substance according to claim 17 for use in a method of inhibiting mitosis.
 19. A substance according to claim 17 for use in a method of increasing the susceptibility of a cell to DNA damaging agents.
 20. A process comprising the steps of: (a) performing the method according to claim 14, 15 or 16; and (b) preparing a quantity of those one or more substances identified as being capable of binding to a mus101 polypeptide or homologue, derivative, variant or fragment thereof.
 21. A process comprising the steps of: (a) performing the method according to claim 14, 15 or 16; and (b) preparing a pharmaceutical composition comprising one or more substances identified as being capable of binding to a mus101 polypeptide or homologue, derivative, variant or fragment thereof.
 22. A method of treating a tumour which method comprises administering to a patient in need of treatment an effective amount of a substance capable of affecting mus101 function.
 23. A method of increasing the susceptibility of a tumour cell to a DNA damaging agent which method comprises administering to said cell a substance capable of affecting mus101 function. 